PaperCodex

Sparse VideoGen2: Accelerate High-Quality Video Generation by 2x—Without Retraining Models

Generating high-fidelity videos with diffusion models has long been bottlenecked by computational inefficiency. Even on powerful GPUs, producing just a…

01/13/2026 | Diffusion Models, Inference Acceleration, Video Generation

LLaMA-MoE: High-Performance Mixture-of-Experts LLM with Only 3.5B Active Parameters

If you’re a developer, researcher, or technical decision-maker working with large language models (LLMs), you’ve likely faced a tough trade-off:…

01/13/2026 | Efficient Inference, Language Modeling, Text Generation

MedLSAM: Slash Annotation Effort in 3D CT Segmentation with Fully Automatic Localization and SAM Integration

Medical image segmentation—especially in 3D CT scans—is a cornerstone of clinical decision support, surgical planning, and radiological research. Yet, despite…

01/13/2026 | 3D Medical Image Segmentation, Few-shot Localization, Foundation Model Adaptation

HaluEval: Detect and Benchmark LLM Hallucinations Across QA, Dialogue, and Summarization

Large language models (LLMs) like ChatGPT are transforming how we interact with AI—but they often “make things up.” These fabricated,…

01/13/2026 | Hallucination Detection, Knowledge-grounded Dialogue, Question Answering

pyvene: Intervene on Any PyTorch Model’s Internal States—No Code Rewriting Required

Imagine being able to precisely edit, steer, or probe a trained PyTorch model—without touching its source code or retraining it…

01/13/2026 | Interpretability, Model Editing, Robustness Evaluation

IoA: Enable Heterogeneous AI Agents to Collaborate Like the Internet — Solve Complex Tasks Beyond Single-Agent Limits

Imagine a world where AI agents—each with unique skills like web browsing, code execution, or data analysis—can autonomously find one…

01/13/2026 | Embodied AI, Multi-agent Collaboration, Retrieval-Augmented Generation

SVFR: Restore Blurry, Damaged, or Black-and-White Face Videos in One Unified Workflow

HarmBench: A Standardized Framework to Evaluate LLM Safety Against Malicious Prompts

FSD V2: High-Performance, Fully Sparse 3D Object Detection for Autonomous Systems

LServe: Accelerate Long-Context LLM Inference with Unified Sparse Attention—No Accuracy Trade-Off