Skip to content

PaperCodex

Subscribe

Text-to-Image Generation

Qwen-Image: Generate and Edit Images with Perfect Text—Even in Chinese

Qwen-Image: Generate and Edit Images with Perfect Text—Even in Chinese 6339

If you’ve ever struggled to generate marketing visuals with legible multilingual text—or tried to edit a product image only to…

12/26/2025Image Editing, Multimodal Text Rendering, Text-to-Image Generation
HunyuanImage-3.0: The Largest Open-Source Multimodal Image Generator with Native Reasoning and MoE Architecture

HunyuanImage-3.0: The Largest Open-Source Multimodal Image Generator with Native Reasoning and MoE Architecture 2562

HunyuanImage-3.0 is a groundbreaking open-source image generation model developed by Tencent. Unlike traditional diffusion-based approaches, it builds a native multimodal…

12/26/2025Mixture-of-Experts (MoE), Multimodal Reasoning, Text-to-Image Generation
Versatile Diffusion: One Unified Model for Text-to-Image, Image-to-Text, and Creative Variations

Versatile Diffusion: One Unified Model for Text-to-Image, Image-to-Text, and Creative Variations 1334

In today’s fast-evolving AI landscape, most generative systems are built for a single task—whether that’s turning text into images, editing…

12/26/2025Image-to-text Captioning, Multimodal Diffusion, Text-to-Image Generation
InstantStyle: Effortless, Tuning-Free Style Preservation for Text-to-Image Generation

InstantStyle: Effortless, Tuning-Free Style Preservation for Text-to-Image Generation 1969

InstantStyle is a breakthrough framework that enables high-fidelity, style-consistent image generation without requiring any model retraining or per-image tuning. Built…

12/19/2025Image Stylization, Style Transfer, Text-to-Image Generation
OmniGen: One Unified Model for All Image Generation Tasks—No Plugins, No Preprocessing, Just Prompts

OmniGen: One Unified Model for All Image Generation Tasks—No Plugins, No Preprocessing, Just Prompts 4282

Modern image generation is powerful—but fragmented. Depending on your goal—generating from text, editing existing images, preserving a person’s identity, or…

12/19/2025Image Editing, Subject-driven Generation, Text-to-Image Generation
Flow-GRPO: Boost Text-to-Image Accuracy with Online RL—Without Sacrificing Quality or Diversity

Flow-GRPO: Boost Text-to-Image Accuracy with Online RL—Without Sacrificing Quality or Diversity 1720

If you’ve ever struggled with diffusion models failing to follow detailed prompts—like “a golden retriever sitting to the left of…

12/19/2025Controllable Diffusion Models, Reinforcement Learning For Generative Models, Text-to-Image Generation
FlowTok: Unified Text-to-Image and Image-to-Text Generation with Compact 1D Tokens

FlowTok: Unified Text-to-Image and Image-to-Text Generation with Compact 1D Tokens 1082

FlowTok reimagines cross-modal generation by collapsing the traditionally complex boundary between text and images into a streamlined, efficient process. Unlike…

12/19/2025Image-to-text Generation, Multimodal Representation Learning, Text-to-Image Generation
Lumina-mGPT 2.0: A Standalone Autoregressive Image Generator That Unifies Multimodal Tasks Without Diffusion Dependencies

Lumina-mGPT 2.0: A Standalone Autoregressive Image Generator That Unifies Multimodal Tasks Without Diffusion Dependencies 1076

In the ever-evolving landscape of generative AI, image synthesis has long been dominated by diffusion models—powerful, yet often complex, resource-intensive,…

12/19/2025Controllable Image Synthesis, Image Editing, Text-to-Image Generation
AnyText: Generate and Edit Multilingual Text in AI Images with Pixel-Perfect Accuracy

AnyText: Generate and Edit Multilingual Text in AI Images with Pixel-Perfect Accuracy 4822

If you’ve ever tried using a standard AI image generator to create a poster, product mockup, or social media banner…

12/18/2025Multilingual Image Synthesis, Text-to-Image Generation, Visual Text Editing
OmniGen2: Unified Open-Source Multimodal Generation for Text-to-Image, Editing, and In-Context Creation

OmniGen2: Unified Open-Source Multimodal Generation for Text-to-Image, Editing, and In-Context Creation 3962

OmniGen2 is an open-source, unified generative model that seamlessly bridges text and vision in a single architecture. Unlike many multimodal…

12/17/2025In-context Generation, Instruction-guided Image Editing, Text-to-Image Generation

Posts pagination

Previous 1 2 3 Next
Copyright © 2026 PaperCodex.
  • Facebook
  • YouTube
  • Twitter

PaperCodex