Awesome Multimodal Generative Modeling Papers and Source Codes

Lumina-Image 2.0: High-Quality, Efficient Text-to-Image Generation with Unified Architecture and Strong Open-Source Support 805

Lumina-Image 2.0 is a state-of-the-art open-source text-to-image (T2I) generation framework that delivers exceptional visual fidelity and prompt adherence while maintaining…

01/09/2026Controllable Image Synthesis, Multimodal Generative Modeling, Text-to-Image Generation

HiDream-I1: Generate and Edit High-Quality Images in Seconds with Sparse Diffusion Transformer 777

The rapid evolution of AI-driven image generation has unlocked incredible creative potential—but often at a steep cost: slow inference, massive…

01/05/2026Instruction-based Image Editing, Multimodal Generative Modeling, Text-to-Image Generation

Waver: Generate Lifelike, High-Motion Videos in 1080p with One Unified Model 588

In the rapidly evolving world of generative AI, video generation has remained a particularly challenging frontier—especially when it comes to…

01/05/2026Image-to-Video Synthesis, Multimodal Generative Modeling, Text-to-Video Generation