Awesome Text-to-Video Generation Papers and Source Codes

ControlVideo: Training-Free, Controllable Text-to-Video Generation with Consistent Motion and Structure 851

Generating high-quality videos from text has long been a challenging frontier in generative AI—especially compared to the rapid advances in…

01/13/2026 | Controllable Video Synthesis, Structure-conditioned Generative Models, Text-to-Video Generation

LongLive: Real-Time Interactive Long Video Generation with Seamless Prompt Control 656

Creating long, coherent, and high-quality videos from text has long been a formidable challenge in generative AI. Existing approaches—especially diffusion-based…

01/09/2026 | Interactive Video Synthesis, Long-form Video Generation, Text-to-Video Generation

Waver: Generate Lifelike, High-Motion Videos in 1080p with One Unified Model 588

In the rapidly evolving world of generative AI, video generation has remained a particularly challenging frontier—especially when it comes to…

01/05/2026 | Image-to-Video Synthesis, Multimodal Generative Modeling, Text-to-Video Generation

PUSA: Generate High-Quality Video from Text or Images for $500—Not $100,000 645

Video generation has long been bottlenecked by two stubborn realities: astronomical training costs and rigid temporal modeling. Most state-of-the-art image-to-video…

01/05/2026 | Image-to-Video Synthesis, Multi-condition Video Diffusion, Text-to-Video Generation

TurboDiffusion: Generate High-Quality AI Videos in Seconds Instead of Minutes on a Single GPU 1449

Video generation using diffusion models has long suffered from a crippling bottleneck: speed. Even the most advanced models can take…

01/04/2026 | Image-to-Video Synthesis, Text-to-Video Generation, Video Diffusion Acceleration

Step-Video-T2V: Generate High-Quality, Long-Form Videos from Text in English and Chinese 3139

Step-Video-T2V is a state-of-the-art open-source text-to-video foundation model developed by StepFun AI. With 30 billion parameters and the ability to…

12/27/2025 | Multimodal Foundation Models, Text-to-Video Generation, Video Diffusion Models

Show-1: High-Quality, Efficient Text-to-Video Generation with Precise Prompt Alignment 1133

Text-to-video generation has rapidly evolved, yet technical teams still face a persistent trade-off: high-quality outputs often come at prohibitive computational…

12/22/2025 | Diffusion Models, Text-to-Video Generation, Video Synthesis

Open-Sora Plan: Open-Source High-Quality Long Video Generation for Real-World Applications 12044

Open-Sora Plan is an open-source initiative designed to democratize access to state-of-the-art video generation capabilities. Inspired by the promise of…

12/22/2025 | Image-to-Video Synthesis, Open-source Video Diffusion Models, Text-to-Video Generation