PaperCodex

SchurVINS: High-Accuracy Visual-Inertial Navigation with Low Computational Overhead for Resource-Constrained Devices

Visual-Inertial Navigation Systems (VINS) are critical for applications like drones, robotics, and augmented reality, where precise real-time localization is required…

01/13/2026 | SLAM, State Estimation, Visual-Inertial Odometry

AgentLite: Build Task-Oriented LLM Agents Fast—Without the Framework Bloat

Developing effective, task-oriented agents powered by large language models (LLMs) has become a priority for researchers and developers alike. However,…

01/13/2026 | LLM Reasoning, Multi-agent Systems, Task-oriented Agents

RepoAgent: Auto-Generate & Maintain Repository-Level Code Docs with LLMs

Keeping code documentation up to date is one of the most universally acknowledged yet consistently neglected tasks in software development.…

01/13/2026 | Code Documentation Generation, LLM-powered Software Maintenance, Repository-level Documentation

Sparse VideoGen2: Accelerate Video Diffusion Models 2.3x Without Retraining or Quality Loss

Video generation using diffusion transformers (DiTs) has reached remarkable visual fidelity—but at a steep computational cost. The quadratic complexity of…

01/13/2026 | Diffusion Models, Efficient Inference, Video Generation

SSLRec: A Unified, Plug-and-Play Framework for Self-Supervised Recommendation Systems

Recommender systems are foundational to modern digital experiences—from streaming platforms to e-commerce—but they face a persistent challenge: user interaction data…

01/13/2026 | Collaborative Filtering, Recommendation Systems, Self-supervised Learning

MINIMA: Universal Cross-Modality Image Matching Without Custom Models for Every Sensor

In real-world computer vision systems—whether for autonomous vehicles, remote sensing, or robotic inspection—images rarely come from a single type of…

01/13/2026 | Cross-modality Image Matching, Multimodal Perception, Zero-shot Generalization