PaperCodex

MoBA: Efficient Long-Context Attention for LLMs Without Compromising Reasoning Quality

Handling long input sequences—ranging from tens of thousands to over a million tokens—is no longer a theoretical benchmark but a…

12/22/2025 | Efficient Attention, Long-context Language Modeling, Sparse Attention

TextBox 2.0: A Unified Library for Rapid Text Generation with Pre-Trained Language Models

If you’ve ever struggled to compare BART, T5, and a custom Chinese language model on summarization, translation, or dialogue generation—only…

12/22/2025 | Machine Translation, Summarization, Text Generation

ZoeDepth: Metric-Accurate, Zero-Shot Monocular Depth Estimation for Real-World Applications

Depth estimation from a single RGB image—monocular depth estimation—is a foundational task in computer vision with far-reaching implications in robotics,…

12/22/2025 | Metric Depth Estimation, Monocular Depth Estimation, Zero-shot Depth Prediction

VAD: Vectorized End-to-End Autonomous Driving for Faster, Safer Planning

Autonomous driving systems must balance accuracy, safety, and real-time performance. Traditional approaches often rely on dense rasterized representations of the…

12/22/2025 | End-to-End Autonomous Driving, Trajectory Planning, Vectorized Scene Representation

InfiniteYou: High-Fidelity Identity-Preserving Image Generation with Flexible Prompt Control

Personalized image generation has long struggled with a fundamental trade-off: how to maintain strong identity fidelity while enabling flexible, high-quality…

12/22/2025 | Identity-Preserving Image Generation, Personalized Diffusion Models, Text-to-image Synthesis

OmniDocBench: A Real-World, Fine-Grained Benchmark for Fair and Comprehensive PDF Document Parsing Evaluation

Evaluating document parsing systems has long been a frustrating exercise in inconsistency. Many existing benchmarks focus narrowly on clean academic…

12/22/2025 | Document Parsing, Layout Analysis, Multimodal Document Understanding

detrex: A Unified, Modular Benchmark for Detection Transformers: Accelerate Object Detection, Segmentation, and Pose Estimation Research

If you’re evaluating object detection frameworks for a new computer vision project, you’ve likely encountered the rise of DETR (Detection…

12/22/2025 | Instance Segmentation, Object Detection, Pose Estimation