DigiHuman

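A system for generating full-body 3D character animation from camera input

DigiHuman is an open-source AI project that automatically generates full-body and facial animation for 3D character models from camera input. The system combines MediaPipe, which generates 3D keypoints, with Unity3D rendering to produce full-body motion and facial expression animation. It supports animating multiple blend shapes and exporting video, providing an automated solution for animating 3D virtual characters. The project supports multiple types of 3D models and smooths the resulting animation, giving animators and researchers a new tool.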


DigiHuman is a project that aims to automatically generate <b>whole-body pose animation and facial animation</b> on 3D character models from camera input. <br/> This project is my B.Sc. thesis in Computer Engineering at Amirkabir University of Technology (AUT).

About DigiHuman

DigiHuman is a system for automating animation generation on 3D virtual characters. It uses pose estimation and facial landmark generator models to create full body and face animation on 3D virtual characters. <br/> DigiHuman is developed with MediaPipe and Unity3D. MediaPipe generates 3D landmarks for the whole human body and face, and Unity3D renders the final animation after the landmarks generated by MediaPipe have been processed. The diagram below shows the overall architecture of the application.

Sample Outputs of the project

<div align="center"> <a href="https://youtu.be/maUUXfe_EcU">Project demo</a> | <a href="https://youtu.be/L62w5AMaFOk">Tutorial</a> </div>

Hands animations

Full body animation

Face animation

Installation

Follow the instructions to run the program!

Backend server installation

Install the MediaPipe Python package.

 pip install mediapipe

Install the OpenCV Python package.

 pip install opencv-python

Go to the backend directory and install the other requirements:

 pip install -r requirements.txt

You'll need to download the pre-trained generator model for the COCO dataset and place it into backend/checkpoints/coco_pretrained/.
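Before starting the server, it can help to confirm that the steps above took effect. The following is a minimal sketch, not part of the project: it assumes it is run from the repository root, and because the checkpoint file name is not given here, it only checks that the directory contains at least one file. The helper names `missing_modules` and `checkpoint_present` are hypothetical.

```python
import importlib.util
from pathlib import Path

# Assumption: the script is run from the repository root. The directory comes
# from the instructions above; the checkpoint file name is not specified there.
CHECKPOINT_DIR = Path("backend/checkpoints/coco_pretrained")


def missing_modules(names=("mediapipe", "cv2")):
    """Return the module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def checkpoint_present(directory=CHECKPOINT_DIR):
    """Return True if the directory exists and holds at least one file."""
    directory = Path(directory)
    return directory.is_dir() and any(p.is_file() for p in directory.iterdir())


if __name__ == "__main__":
    print("missing modules:", missing_modules() or "none")
    print("checkpoint found:", checkpoint_present())
```

If a module is reported missing, rerun the matching `pip install` command. If the checkpoint is not found, check that the downloaded model was placed in the directory named above.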

Unity3D Installation

Install Unity3D and its requirements by following the steps below (skip steps 1-3 if Unity3D is already installed).

1. Download and install UnityHub.
2. Add a new license in UnityHub and register it.
3. Install a Unity Editor inside UnityHub (LTS versions and versions newer than 2020.3.25f1 are recommended).
4. In the Unity project settings, allow HTTP connections in the Player settings.

Download and import the following packages into your project to enable recording with FFmpeg (download the .unitypackage files and drag them into your project).

FFmpegOut package (MIT license)
FFmpegOutBinaries package (GPL)

Usage

Run the backend server from the backend directory with the following command:
```
 python server.py
```
Run the Unity project and open the main scene at Assets\Scenes\MainScene.unity.
Test the program by uploading videos to the backend from the Unity project (you can also test the application by selecting one of the provided animations from the right-side menu).
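As described earlier, MediaPipe produces 3D landmarks on the backend and Unity3D renders them. The sketch below shows one way a single frame of landmarks could be packaged as JSON for transfer over HTTP. The payload layout, the `Landmark` stand-in class, and the `frame_to_json` helper are assumptions made for illustration and are not the project's actual wire format.

```python
import json
from dataclasses import dataclass


@dataclass
class Landmark:
    """Stand-in for a MediaPipe landmark, which exposes x, y, z, visibility."""
    x: float
    y: float
    z: float
    visibility: float = 1.0


def frame_to_json(frame_index, landmarks, min_visibility=0.5):
    """Serialize one frame of landmarks; low-visibility points are flagged.

    The field names ("frame", "landmarks", "visible") are hypothetical.
    """
    points = [
        {
            "id": i,
            "x": round(lm.x, 5),
            "y": round(lm.y, 5),
            "z": round(lm.z, 5),
            "visible": lm.visibility >= min_visibility,
        }
        for i, lm in enumerate(landmarks)
    ]
    return json.dumps({"frame": frame_index, "landmarks": points})


if __name__ == "__main__":
    sample = [Landmark(0.1, 0.2, -0.3, 0.9), Landmark(0.5, 0.5, 0.0, 0.1)]
    print(frame_to_json(0, sample))
```

Because the function only reads the `x`, `y`, `z`, and `visibility` attributes, the landmark objects returned by MediaPipe's pose solution could be passed in place of the stand-in class.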

Adding new 3D characters

You can add your own characters to the project! Characters should have a standard Humanoid rig to show kinematic animations. To render face animations, characters should also have a facial rig (blend shapes).<br/> Follow these steps to add your character:

Find a 3D character model in the Unity Asset Store or download a free one (for example, from websites like Mixamo).
Open the character settings and set the rig to Humanoid.

Drag and drop your 3D character model onto the CharacterChooser/CharacterSlideshow/Parent object in the Unity main scene, as in the image below.

Add the BlendShapeController and QualityData components to the character object in the scene (the one dragged into the Parent object in the previous step).
Set the BlendShapeController values.

Add the character's SkinnedMeshRenderer component to the BlendShapeController component.

Find the index of each blend shape under SkinnedMeshRenderer and enter those numbers in the BlendShapes field of BlendShapeController. This tells the BlendShapeController component which blend shape is which, so that the animation appears on the character's face when these blend shape values are modified.

Open the CharacterSlideshow object at the CharacterChooser/CharacterSlideshow path in the scene hierarchy, then add the newly dragged character to the nodes property (all characters should be referenced inside nodes).

Run the application and you can now select your character for rendering animation!
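To make the role of the BlendShapes field more concrete, the sketch below illustrates the kind of name-to-index mapping described in the steps above. It is written in Python for illustration only (the project's controller is a Unity component), the blend shape names are hypothetical, and it assumes normalized input scores in the range 0 to 1 that are scaled to the 0 to 100 range conventionally used for Unity blend shape weights.

```python
def to_unity_weights(scores, blendshape_indices):
    """Map normalized scores (0..1) by name to {blend shape index: weight}.

    `blendshape_indices` plays the role of the BlendShapes field: it records
    which index on the SkinnedMeshRenderer corresponds to which facial shape.
    """
    weights = {}
    for name, score in scores.items():
        index = blendshape_indices.get(name)
        if index is None:
            continue  # this character has no blend shape with that name
        clamped = min(max(score, 0.0), 1.0)  # keep scores inside 0..1
        weights[index] = clamped * 100.0     # scale to the 0..100 range
    return weights


if __name__ == "__main__":
    # Hypothetical names and indices for a single character.
    indices = {"mouthOpen": 12, "eyeBlinkLeft": 3}
    print(to_unity_weights({"mouthOpen": 0.25, "eyeBlinkLeft": 1.7}, indices))
```

Scores for shapes that a character does not define are skipped, which is why each character needs its own index mapping.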

Features

Making full body animation
Animating multiple blend shapes on a 3D character (up to 40 blend shape animations are currently supported)
Supporting any 3D model with a Humanoid T-Pose rig
Exporting animation to a video file
Saving animation data and re-rendering it for future use
Filtering MediaPipe outputs to detect and remove noise and improve smoothness (low-pass filtering is currently used)

Animating the character's face in great detail
- Training a regression model to generate blend shape weights from the output of MediaPipe FaceMesh (468 points)
- Using StyleGAN techniques to replace the whole character face mesh
Automatic rigging for 3D models without humanoid rig (Using deep neural network models like RigNet)
Generating complete character mesh automatically using models like PIFuHD (in progress!)
Animating 3D character mouth in great detail using audio signal or natural language processing methods
Generating complete environment in 3D
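The feature list above mentions low-pass filtering of MediaPipe outputs. One common form is a first-order filter (exponential smoothing), sketched below for a single 3D landmark. This is an illustration of the general technique, not the project's own filter: the class name and the `alpha` parameter are assumptions.

```python
class LowPassFilter:
    """First-order low-pass filter (exponential smoothing) for one 3D point."""

    def __init__(self, alpha=0.5):
        # alpha close to 1 follows the input quickly; close to 0 smooths more.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state = None

    def update(self, point):
        """Feed one (x, y, z) sample and return the smoothed point."""
        if self._state is None:
            self._state = tuple(point)  # first sample passes through unchanged
        else:
            a = self.alpha
            self._state = tuple(
                a * p + (1.0 - a) * s for p, s in zip(point, self._state)
            )
        return self._state


if __name__ == "__main__":
    f = LowPassFilter(alpha=0.5)
    for sample in [(0.0, 0.0, 0.0), (1.0, 2.0, 4.0), (1.0, 2.0, 4.0)]:
        print(f.update(sample))
```

A smaller `alpha` removes more jitter but makes the character lag behind fast movements, so the value is a trade-off between smoothness and responsiveness. In practice, one filter instance would be kept per landmark.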

Resources

Body Pose Estimation: BlazePose model
- Paper: BlazePose: On-device Real-time Body Pose Tracking
Hands Pose Estimation: MediaPipe Hands model
- Paper: MediaPipe Hands: On-device Real-time Hand Tracking
Face Detection: BlazeFace model
- Paper: BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs
Face Landmark Generator: MediaPipe Face Landmark Model
- Paper: Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs

Licenses & Citations

DigiHuman License

Application License: GPL-3.0 license, non-commercial use only. If you distribute or communicate copies of the modified or unmodified Program, or any portion thereof, you must provide appropriate credit to Danial Kordmodanlou as the original author of the Program. This attribution should be included in any location where the Program is used or displayed.

FFmpeg

FFmpeg is licensed under the GNU Lesser General Public License (LGPL) version 2.1 or later. However, FFmpeg incorporates several optional parts and optimizations that are covered by the GNU General Public License (GPL) version 2 or later. If those parts are used, the GPL applies to all of FFmpeg.
The Unity FFmpeg packages by Keijiro Takahashi are licensed under the MIT license.

GauGAN

Uses the SPADE repository developed by NVIDIA; the customization is adapted from Smart-Sketch, which is under the GPL v3.0 license.

@inproceedings{park2019SPADE,
  title={Semantic Image Synthesis with Spatially-Adaptive Normalization},
  author={Park, Taesung and Liu, Ming-Yu and Wang, Ting-Chun and Zhu, Jun-Yan},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

3D Characters

Unity-chan model and Mixamo models

Contact

Danial Kordmodanlou - kordmodanloo@gmail.com

Website: danial-kord.github.io

Project Link: github.com/Danial-Kord/DigiHuman

Telegram ID:
