Autonomous driving has made remarkable strides, yet it still falters in complex urban settings—especially when confronted with rare, ambiguous, or…
ControlVideo: Training-Free, Controllable Text-to-Video Generation with Consistent Motion and Structure
Generating high-quality videos from text has long been a challenging frontier in generative AI—especially compared to the rapid advances in…
Matcher: One-Shot Segmentation Without Training—Unlock Flexible, Label-Free Perception for Real-World Applications
In modern computer vision workflows, deploying accurate segmentation models often demands large annotated datasets, task-specific architectures, and costly retraining—barriers that…
SAD: Geometry-Aware RGBD Segmentation That Fixes SAM’s Over-Segmentation Problem
The Segment Anything Model (SAM) revolutionized 2D image segmentation by enabling zero-shot, promptable mask generation from RGB images. However, SAM’s…
Prompt-Free Diffusion: Generate Images Without Writing a Single Text Prompt
Text-to-image (T2I) diffusion models have revolutionized creative workflows—but they come with a hidden bottleneck: prompt engineering. Describing an image in…
Uni-ControlNet: Unified Visual Control for Text-to-Image Generation Without Retraining Everything
Generating high-quality images from text prompts has become remarkably powerful thanks to diffusion models like Stable Diffusion. Yet, for many…
FuseChat: Build Smarter, Smaller Chatbots by Fusing Top Open-Source LLMs—No Training From Scratch Needed
In today’s fast-moving AI landscape, teams need high-performing chat models that are both capable and cost-efficient. Yet training large language…
RAGChecker: Fine-Grained Diagnostics for Reliable Retrieval-Augmented Generation Evaluation
Retrieval-Augmented Generation (RAG) has become a cornerstone of modern AI applications, enabling systems to answer questions by combining external knowledge…
MFTCoder: Boost Code LLM Performance with Multi-Task Fine-Tuning—Faster, Smarter, and Open Source
If you’re building or maintaining AI-powered coding assistants, you’ve likely faced a frustrating trade-off: fine-tune a model for one specific…
AgentGym: Build Generalist LLM Agents That Evolve Across Real-World Environments Without Constant Human Supervision
Building AI agents that can handle diverse, real-world tasks—and improve over time without hand-holding—is one of the biggest challenges in…