PaperCodex

TFB: The Fair, Comprehensive Benchmark for Time Series Forecasting That Solves Reproducibility and Bias Problems 1625

Time series forecasting powers critical decisions across industries—from predicting electricity demand and traffic congestion to estimating disease spread and stock…

12/22/2025 | Multivariate Forecasting, Time-series Forecasting, Univariate Forecasting

CKnowEdit: Fix Chinese Linguistic, Factual & Logical Errors in LLMs Without Retraining 2667

Large language models (LLMs) have made remarkable progress in multilingual understanding—but their performance in Chinese remains uneven, especially when it…

12/22/2025 | Chinese NLP, Factual Correction, Knowledge Editing

FastViT: Achieve State-of-the-Art Speed and Accuracy for Vision Tasks on Mobile and Edge Devices 1974

FastViT is a high-performance hybrid vision transformer designed to deliver exceptional speed and accuracy—especially on resource-constrained platforms like mobile phones…

12/22/2025 | Image Classification, Object Detection, Semantic Segmentation

iTransformer: Invert Your Time Series Forecasting Architecture for Better Scalability, Generalization, and Simplicity 1824

Time series forecasting is a foundational task across finance, energy, logistics, and digital platforms—yet traditional Transformer-based models often struggle with…

12/22/2025 | Long-sequence Forecasting, Multivariate Time Series Forecasting, Zero-shot Time Series Generalization

InternLM-XComposer: Generate Rich Text-Image Content and Understand High-Res Visuals with Open, Commercially Free AI 2909

For technical decision makers evaluating multimodal AI, choosing between closed-source APIs and open alternatives often means trading off control,…

12/22/2025 | Multimodal Understanding, Text-image Composition, Vision-language Modeling

Show-1: High-Quality, Efficient Text-to-Video Generation with Precise Prompt Alignment 1133

Text-to-video generation has rapidly evolved, yet technical teams still face a persistent trade-off: high-quality outputs often come at prohibitive computational…

12/22/2025 | Diffusion Models, Text-to-Video Generation, Video Synthesis

TinyLlama: A Fast, Efficient 1.1B Open Language Model for Edge Deployment and Speculative Decoding 8770

TinyLlama is a compact yet powerful open-source language model with just 1.1 billion parameters—but trained on an impressive 3 trillion…

12/22/2025 | On-device Inference, Speculative Decoding, Text Generation

SLAM3R: Real-Time Dense 3D Reconstruction from Monocular Video—No Camera Calibration Needed 1045

Introducing SLAM3R—a cutting-edge, end-to-end system that reconstructs high-quality, dense 3D scenes in real time using only a monocular RGB video…

12/22/2025 | 3D Reconstruction, Monocular SLAM, Neural Scene Representation

3DGUT: Real-Time 3D Reconstruction That Handles Distorted Cameras and Reflections Without Sacrificing Speed 1743

3D Gaussian Splatting (3DGS) revolutionized real-time 3D scene reconstruction by delivering photorealistic quality at high frame rates on consumer GPUs.…

12/22/2025 | 3D Reconstruction, Neural Rendering, Real-time Rendering

LLaVA-CoT: Step-by-Step Visual Reasoning for Reliable, Explainable Multimodal AI 2108

Most vision-language models (VLMs) today can describe what’s in an image—but they often falter when asked to reason about it.…

12/22/2025 | Explainable AI, Multimodal Reasoning, Visual Question Answering