Large AI models—from language generators to video diffusion systems—are bottlenecked by the attention mechanism, whose computational cost scales quadratically with…
Video Generation
Sparse VideoGen2: Accelerate High-Quality Video Generation by 2x—Without Retraining Models
Generating high-fidelity videos with diffusion models has long been bottlenecked by computational inefficiency. Even on powerful GPUs, producing just a…
Sparse VideoGen2: Accelerate Video Diffusion Models 2.3x Without Retraining or Quality Loss
Video generation using diffusion transformers (DiTs) has reached remarkable visual fidelity—but at a steep computational cost. The quadratic complexity of…
Radial Attention: Generate 4× Longer Videos 3.7× Faster with O(n log n) Sparse Attention
Generating high-quality, long-form videos with diffusion models remains one of the most computationally demanding tasks in generative AI. Standard attention…
MAGI-1: Autoregressive Video Generation at Scale with Constant Memory and Real-Time Streaming
MAGI-1 is a breakthrough world model designed for autoregressive video generation at scale. Unlike conventional video diffusion or transformer-based approaches…
UniAnimate-DiT: High-Fidelity Human Animation from a Single Image and Pose Sequence – No Full Retraining Needed
Animating a static human image into a realistic, temporally coherent video used to require massive datasets, complex pipelines, or retraining…
FramePack: Generate Long, High-Quality Videos on a Laptop—Without Cloud Costs or Drifting Artifacts
Creating long, coherent, and visually rich videos with AI has long been bottlenecked by computational complexity, memory constraints, and error…
VSA: Accelerate Video Diffusion Models by 2.5× with Trainable Sparse Attention—No Quality Tradeoff
Video generation using diffusion transformers (DiTs) is rapidly advancing—but at a steep computational cost. Full 3D attention in these models…
Tora: Precisely Control Motion in AI-Generated Videos with Trajectory Guidance
Creating videos with predictable, controllable motion has long been a major challenge in generative AI. While recent diffusion models produce…
StoryDiffusion: Generate Consistent Long-Form Visual Stories from Text Without Retraining Models
Creating visually coherent sequences of images or videos from text prompts has long been a bottleneck in AI-powered storytelling. While…