PaperCodex

LMDrive: The First Language-Guided, Closed-Loop Autonomous Driving System for Human-Centric Navigation

Autonomous driving has made remarkable strides, yet it still falters in complex urban settings—especially when confronted with rare, ambiguous, or…

01/13/2026 | Autonomous Driving, Embodied AI, Language-guided Control
ControlVideo: Training-Free, Controllable Text-to-Video Generation with Consistent Motion and Structure

Generating high-quality videos from text has long been a challenging frontier in generative AI—especially compared to the rapid advances in…

01/13/2026 | Controllable Video Synthesis, Structure-conditioned Generative Models, Text-to-Video Generation
Matcher: One-Shot Segmentation Without Training—Unlock Flexible, Label-Free Perception for Real-World Applications

In modern computer vision workflows, deploying accurate segmentation models often demands large annotated datasets, task-specific architectures, and costly retraining—barriers that…

01/13/2026 | One-shot Segmentation, Open-world Perception, Zero-shot Learning
SAD: Geometry-Aware RGBD Segmentation That Fixes SAM’s Over-Segmentation Problem

The Segment Anything Model (SAM) revolutionized 2D image segmentation by enabling zero-shot, promptable mask generation from RGB images. However, SAM’s…

01/13/2026 | 3D Panoptic Segmentation, RGBD Segmentation, Zero-shot Semantic Segmentation
Prompt-Free Diffusion: Generate Images Without Writing a Single Text Prompt

Text-to-image (T2I) diffusion models have revolutionized creative workflows—but they come with a hidden bottleneck: prompt engineering. Describing an image in…

01/13/2026 | Prompt-free Diffusion, Text-to-Image Generation, Visual-conditioned Image Synthesis
Uni-ControlNet: Unified Visual Control for Text-to-Image Generation Without Retraining Everything

Generating high-quality images from text prompts has become remarkably powerful thanks to diffusion models like Stable Diffusion. Yet, for many…

01/13/2026 | Controllable Diffusion Models, Multimodal Conditioning, Text-to-Image Generation
FuseChat: Build Smarter, Smaller Chatbots by Fusing Top Open-Source LLMs—No Training From Scratch Needed

In today’s fast-moving AI landscape, teams need high-performing chat models that are both capable and cost-efficient. Yet training large language…

01/13/2026 | Chatbot Deployment, Instruction Following, Knowledge Fusion
RAGChecker: Fine-Grained Diagnostics for Reliable Retrieval-Augmented Generation Evaluation

Retrieval-Augmented Generation (RAG) has become a cornerstone of modern AI applications, enabling systems to answer questions by combining external knowledge…

01/13/2026 | Claim-level Factuality Assessment, RAG Diagnostics, Retrieval-Augmented Generation Evaluation
MFTCoder: Boost Code LLM Performance with Multi-Task Fine-Tuning—Faster, Smarter, and Open Source

If you’re building or maintaining AI-powered coding assistants, you’ve likely faced a frustrating trade-off: fine-tune a model for one specific…

01/13/2026 | Code Generation, Model Fine-tuning, Multi-task Learning
AgentGym: Build Generalist LLM Agents That Evolve Across Real-World Environments Without Constant Human Supervision

Building AI agents that can handle diverse, real-world tasks—and improve over time without hand-holding—is one of the biggest challenges in…

01/13/2026 | Agent Generalization, LLM-based Agents, Reinforcement Learning

Copyright © 2026 PaperCodex.