awesome-conditional-content-generation
This repository collects resources and papers on conditional content generation, with a focus on human motion generation, image generation, and video generation. This repo is maintained by Haofan Wang.
If you are interested in controllable content generation (2D/3D), would like to collaborate with me academically or are seeking an internship, and have published at least one top-conference paper, feel free to email me at haofanwang.ai@gmail.com. Applicants from both academia and industry are welcome.
Contents
- Papers List
- Tracking Papers on Diffusion Models
- Conditional Human Motion Generation
- Conditional Image Generation
- Conditional Video Generation
Papers
Music-Driven Motion Generation
Taming Diffusion Models for Music-driven Conducting Motion Generation
NUS, AAAI 2023 Summer Symposium, [Code]
Music-Driven Group Choreography
AIOZ AI, CVPR'23
Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation
Illinois Institute of Technology, ICLR'23, [Code]
Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
Tsinghua University, 7 Dec 2022
Pretrained Diffusion Models for Unified Human Motion Synthesis
DAMO Academy, Alibaba Group, 6 Dec 2022
EDGE: Editable Dance Generation From Music
Stanford University, 19 Nov 2022
You Never Stop Dancing: Non-freezing Dance Generation via Bank-constrained Manifold Projection
MSRA, NeurIPS'22
GroupDancer: Music to Multi-People Dance Synthesis with Style Collaboration
Tsinghua University, ACMMM'22
A Brand New Dance Partner: Music-Conditioned Pluralistic Dancing Controlled by Multiple Dance Genres
Yonsei University, CVPR 2022, [Code]
Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory
NTU, CVPR 2022 (Oral), [Code]
Dance Style Transfer with Cross-modal Transformer
KTH, 22 Aug 2022, [Upcoming Code]
Music-driven Dance Regeneration with Controllable Key Pose Constraints
Tencent, 8 July 2022
AI Choreographer: Music Conditioned 3D Dance Generation with AIST++
USC, ICCV 2021, [Code]
Text-Driven Motion Generation
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
NTU, CVPR'23, [Code]
GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
Peking University, CVPR'23
Human Motion Diffusion as a Generative Prior
Anonymous Authors, [Code]
T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
Tencent AI Lab, 16 Jan 2023, [Code]
Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models
Beihang University, 10 Jan 2023
Executing your Commands via Motion Diffusion in Latent Space
Tencent, 8 Dec 2022, [Code]
MultiAct: Long-Term 3D Human Motion Generation from Multiple Action Labels
Seoul National University, AAAI 2023 Oral, [Code]
MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Max Planck Institute for Informatics, 8 Dec 2022
UDE: A Unified Driving Engine for Human Motion Generation
Xiaobing Inc, 29 Nov 2022, [Upcoming Code]
MotionBERT: Unified Pretraining for Human Motion Analysis
SenseTime Research, 12 Oct 2022, [Code]
Human Motion Diffusion Model
Tel Aviv University, 3 Oct 2022, [Code]
FLAME: Free-form Language-based Motion Synthesis & Editing
Korea University, 1 Sep 2022
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
NTU, 22 Aug 2022, [Code]
TEMOS: Generating diverse human motions from textual descriptions
MPI, ECCV 2022 (Oral), [Code]
GIMO: Gaze-Informed Human Motion Prediction in Context
Stanford University, ECCV 2022, [Code]
MotionCLIP: Exposing Human Motion Generation to CLIP Space
Tel Aviv University, ECCV 2022, [Code]
Generating Diverse and Natural 3D Human Motions from Text
University of Alberta, CVPR 2022, [Code]
AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars
NTU, SIGGRAPH 2022, [Code]
Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents
University of Maryland, VR 2021, [Code]
Audio-Driven Motion Generation
For more recent papers, see here
Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
NTU, CVPR'23, [Code]
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Zhejiang University, ICLR'23, [Code]
DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model
Macau University of Science and Technology, 24 Jan 2023
DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis
Tsinghua University, 10 Jan 2023
Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation
University of Wrocław, 6 Jan 2023, [Upcoming Code]
Generating Holistic 3D Human Motion from Speech
Max Planck Institute for Intelligent Systems, 8 Dec 2022
Audio-Driven Co-Speech Gesture Video Generation
NTU, 5 Dec 2022
Listen, denoise, action! Audio-driven motion synthesis with diffusion models
KTH Royal Institute of Technology, 17 Nov 2022
ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
York University, 23 Sep 2022, [Code]
BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis
The University of Tokyo, ECCV 2022, [Code]
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
Nanjing University, SIGGRAPH 2022, [Code]
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
The Chinese University of Hong Kong, CVPR 2022, [Code]
SEEG: Semantic Energized Co-speech Gesture Generation
Alibaba DAMO Academy, CVPR 2022, [Code]
FaceFormer: Speech-Driven 3D Facial Animation with Transformers
The University of Hong Kong, CVPR 2022, [Code]
Freeform Body Motion Generation from Speech
JD AI Research, 4 Mar 2022, [Code]
Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders
Tencent AI Lab, ICCV 2021, [Code]
Learning Speech-driven 3D Conversational Gestures from Video
Max Planck Institute for Informatics, IVA 2021, [Code]
Learning Individual Styles of Conversational Gesture
UC Berkeley, CVPR 2019, [Code]
Human Motion Prediction
For more recent papers, see here
InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion
UIUC, ICCV 2023, [Code]
Stochastic Multi-Person 3D Motion Forecasting
UIUC, ICLR 2023 (Spotlight), [Code]
HumanMAC: Masked Motion Completion for Human Motion Prediction
Tsinghua University, ICCV 2023, [Code]
BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
University of Barcelona, 25 Nov 2022, [Upcoming Code]
Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors
UIUC, ECCV 2022 (Oral), [Code]
PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
NAVER LABS, ECCV 2022, [Code]
NeMF: Neural Motion Fields for Kinematic Animation
Yale University, NeurIPS 2022 (Spotlight), [Code]
Multi-Person Extreme Motion Prediction
Inria, CVPR 2022, [Code]
MotionMixer: MLP-based 3D Human Body Pose Forecasting
Mercedes-Benz, IJCAI 2022 (Oral), [Code]
Multi-Person 3D Motion Prediction with Multi-Range Transformers
UCSD, NeurIPS 2021
Motion Applications
MIME: Human-Aware 3D Scene Generation
MPI
Scene Synthesis from Human Motion
Stanford University, SIGGRAPH Asia 2022, [Code]
TEACH: Temporal Action Compositions for 3D Humans
MPI, 3DV 2022