Awesome Text-to-Image Generation Papers and Source Code

Liquid: One Unified Language Model for Text and Images—No CLIP, No Compromises

What if a single large language model (LLM) could both understand and generate high-quality images—without relying on external vision encoders…

01/13/2026 | Multimodal Generation, Text-to-Image Generation, Visual Understanding

Prompt-Free Diffusion: Generate Images Without Writing a Single Text Prompt

Text-to-image (T2I) diffusion models have revolutionized creative workflows—but they come with a hidden bottleneck: prompt engineering. Describing an image in…

01/13/2026 | Prompt-free Diffusion, Text-to-Image Generation, Visual-conditioned Image Synthesis

Uni-ControlNet: Unified Visual Control for Text-to-Image Generation Without Retraining Everything

Generating high-quality images from text prompts has become remarkably powerful thanks to diffusion models like Stable Diffusion. Yet, for many…

01/13/2026 | Controllable Diffusion Models, Multimodal Conditioning, Text-to-Image Generation

XVerse: Precise Multi-Subject Image Generation with Independent Identity and Attribute Control

Generating realistic images with multiple distinct subjects—each retaining their unique identity and visual attributes like pose, lighting, or clothing style—has…

01/09/2026 | Controllable Image Generation, Multi-subject Image Synthesis, Text-to-Image Generation

NextStep-1: High-Fidelity Autoregressive Image Generation Without Diffusion or Discrete Token Loss

Autoregressive (AR) models have long dominated natural language generation, but applying the same step-by-step prediction approach to images has been…

01/09/2026 | Autoregressive Modeling, Image Editing, Text-to-Image Generation

Lumina-Image 2.0: High-Quality, Efficient Text-to-Image Generation with Unified Architecture and Strong Open-Source Support

Lumina-Image 2.0 is a state-of-the-art open-source text-to-image (T2I) generation framework that delivers exceptional visual fidelity and prompt adherence while maintaining…

01/09/2026 | Controllable Image Synthesis, Multimodal Generative Modeling, Text-to-Image Generation

HiDream-I1: Generate and Edit High-Quality Images in Seconds with Sparse Diffusion Transformer

LyCORIS: Customize Stable Diffusion Without Retraining the Whole Model – Flexible, Lightweight Fine-Tuning for Text-to-Image Generation

If you’re working with text-to-image models like Stable Diffusion, you’ve likely faced the trade-off between customization and efficiency. Full fine-tuning…

12/27/2025 | Model Customization, Parameter-Efficient Fine-Tuning, Text-to-Image Generation
