PaperCodex

Lite-HRNet: High-Accuracy Human Pose Estimation and Semantic Segmentation with Minimal Compute

When building real-time vision applications for mobile, embedded, or edge devices, developers often face a tough trade-off: accuracy versus efficiency.…

01/13/2026 | Human Pose Estimation, Lightweight Neural Networks, Semantic Segmentation

DynamicViT: Slash Vision Transformer Compute by 30% Without Sacrificing Accuracy

Vision Transformers (ViTs) have revolutionized computer vision, but their computational demands remain a major barrier for real-world deployment—especially on edge…

01/13/2026 | Efficient Vision Transformers, Image Classification, Model Acceleration

EGVSR: Real-Time 4K Video Super-Resolution with High Visual Quality and Edge Deployment Ready

Video super-resolution (VSR) has long promised to breathe new life into low-quality video content—enhancing resolution, restoring detail, and eliminating the…

01/13/2026 | 4K Upscaling, Real-time Video Processing, Video Super-resolution

Mengzi: Lightweight, High-Performance Chinese Pre-Trained Models for Efficient NLP Deployment

In recent years, pre-trained language models (PLMs) have revolutionized natural language processing (NLP), delivering state-of-the-art results across a wide spectrum…

01/13/2026 | Multimodal Learning, Text Classification, Text Generation

MambaIR: High-Quality Image Restoration with Efficient State-Space Models

Image restoration—recovering clean, high-resolution images from degraded inputs—is a foundational task in computer vision with applications ranging from smartphone photography…

01/13/2026 | Image Denoising, Image Restoration, Super-resolution

NVIDIA FLARE: Build Privacy-Preserving AI Across Organizations—Without Moving Data

In today’s data-driven world, organizations often face a fundamental dilemma: they want to build powerful, generalizable AI models, but their…

01/13/2026 | Collaborative AI, Federated Learning, Privacy-preserving Machine Learning

Fast-BEV: High-Speed, High-Accuracy Bird’s-Eye View Perception for Real-World Autonomous Driving Systems

Fast-BEV emerges as a compelling solution to a longstanding challenge in autonomous vehicle (AV) perception: achieving both high inference speed…

01/13/2026 | Autonomous Driving Perception, Bird's-Eye View Perception, Real-time Object Detection

CodeI/O: Boost LLM Reasoning Across Domains by Learning from Code Input-Output Patterns

Large language models (LLMs) have demonstrated impressive capabilities in code generation and narrow reasoning tasks like mathematics. Yet, when it…

01/13/2026 | Procedural Reasoning, Reasoning Enhancement, Verifiable Chain-of-thought

SEED-Voken: Scalable, High-Fidelity Visual Tokenization for Autoregressive Image and Video Generation

SEED-Voken is an open-source toolkit developed by Tencent ARC that delivers state-of-the-art visual tokenizers tailored for autoregressive visual generation. Built…

01/13/2026 | Autoregressive Image Generation, Video Tokenization, Visual Representation Learning

TSMixer: A Lightweight, High-Performance Alternative to Transformers for Multivariate Time Series Forecasting

Time series forecasting is essential across industries—from predicting energy demand and stock trends to managing supply chains and monitoring IoT…

01/13/2026 | Foundation Models, Multivariate Forecasting, Time-series Forecasting