A Collection of Video Generation Studies
This GitHub repository summarizes papers and resources related to the video generation task.
If you have any suggestions about this repository, please feel free to open a new issue or pull request.
Recent news about this GitHub repo is listed as follows.
🔥 Click to see more information.
- [Jun. 17th] All NeurIPS 2023 papers and references are updated.
- [Apr. 26th] Added a new direction: Personalized Video Generation.
- [Mar. 28th] The official AAAI 2024 paper list is released! Official versions of the PDFs and BibTeX references are updated accordingly.
Contents
To-Do Lists
- Latest Papers
- Update ECCV 2024 Papers
- Update CVPR 2024 Papers
- Update PDFs and References of ⚠️ Papers
- Update Published Versions of References
- Update AAAI 2024 Papers
- Update PDFs and References of ⚠️ Papers
- Update Published Versions of References
- Update ICLR 2024 Papers
- Update NeurIPS 2023 Papers
- Previously Published Papers
- Update Previous CVPR papers
- Update Previous ICCV papers
- Update Previous ECCV papers
- Update Previous NeurIPS papers
- Update Previous ICLR papers
- Update Previous AAAI papers
- Update Previous ACM MM papers
- Regular Maintenance of Preprint arXiv Papers and Missed Papers
Products
Name | Organization | Year | Research Paper | Website | Specialties |
---|---|---|---|---|---|
Sora | OpenAI | 2024 | link | link | - |
Lumiere | Google | 2024 | link | link | - |
VideoPoet | Google | 2023 | - | link | - |
W.A.L.T | Google | 2023 | link | link | - |
Gen-2 | Runway | 2023 | - | link | - |
Gen-1 | Runway | 2023 | - | link | - |
Animate Anyone | Alibaba | 2023 | link | link | - |
Outfit Anyone | Alibaba | 2023 | - | link | - |
Stable Video | StabilityAI | 2023 | link | link | - |
Pixeling | HiDream.ai | 2023 | - | link | - |
DomoAI | DomoAI | 2023 | - | link | - |
Emu | Meta | 2023 | link | link | - |
Genmo | Genmo | 2023 | - | link | - |
NeverEnds | NeverEnds | 2023 | - | link | - |
Moonvalley | Moonvalley | 2023 | - | link | - |
Morph Studio | Morph | 2023 | - | link | - |
Pika | Pika | 2023 | - | link | - |
PixelDance | ByteDance | 2023 | link | link | - |
Papers
Survey Papers
- Year 2024
- arXiv
- Video Diffusion Models: A Survey [Paper]
- Year 2023
- arXiv
- A Survey on Video Diffusion Models [Paper]
Text-to-Video Generation
- Year 2024
- CVPR
- Vlogger: Make Your Dream A Vlog [Paper] [Code]
- Make Pixels Dance: High-Dynamic Video Generation [Paper] [Project] [Demo]
- VGen: Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation [Paper] [Code] [Project]
- GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation [Paper] [Project]
- SimDA: Simple Diffusion Adapter for Efficient Video Generation [Paper] [Code] [Project]
- MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation [Paper] [Project] [Video]
- Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models [Paper] [Project]
- PEEKABOO: Interactive Video Generation via Masked-Diffusion [Paper] [Code] [Project] [Demo]
- EvalCrafter: Benchmarking and Evaluating Large Video Generation Models [Paper] [Code] [Project]
- A Recipe for Scaling up Text-to-Video Generation with Text-free Videos [Paper] [Code] [Project]
- BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models [Paper] [Project]
- Mind the Time: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis [Paper] [Project]
- Animate Anyone: Consistent and Controllable Image-to-video Synthesis for Character Animation [Paper] [Code] [Project]
- MotionDirector: Motion Customization of Text-to-Video Diffusion Models [Paper] [Code]
- Hierarchical Patch-wise Diffusion Models for High-Resolution Video Generation [Paper] [Project]
- DiffPerformer: Iterative Learning of Consistent Latent Guidance for Diffusion-based Human Video Generation [Paper] [Code]
- Grid Diffusion Models for Text-to-Video Generation [Paper] [Code] [Video]
- ICLR
- AAAI
- Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos [Paper] [Code] [Project]
- E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning [Paper]
- ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation [Paper] [Code] [Project]
- F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis [Paper]
- arXiv
- Lumiere: A Space-Time Diffusion Model for Video Generation [Paper] [Project]
- Boximator: Generating Rich and Controllable Motions for Video Synthesis [Paper] [Project] [Video]
- World Model on Million-Length Video And Language With RingAttention [Paper] [Code] [Project]
- Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion [Paper] [Project]
- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens [Paper] [Code] [Project]
- MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation [Paper] [Project]
- Latte: Latent Diffusion Transformer for Video Generation [Paper] [Code] [Project]
- Mora: Enabling Generalist Video Generation via A Multi-Agent Framework [Paper] [Code]
- StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text [Paper] [Code] [Project] [Video]
- VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models [Paper]
- StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation [Paper] [Code] [Project] [Demo]
- Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model [Paper] [Code] [Project]
- Others
- Sora: Video Generation Models as World Simulators [Paper]
- Year 2023
- CVPR
- Align your Latents: High-resolution Video Synthesis with Latent Diffusion Models [Paper] [Project] [Reproduced code]
- Text2Video-Zero: Text-to-image Diffusion Models are Zero-shot Video Generators