PaperCodex

MARS: Accelerate Large Model Training with Variance-Reduced Optimization That Actually Works

Training large language models and vision architectures is notoriously slow, unstable, and expensive. Practitioners routinely face diminishing returns from standard…

01/13/2026 | Large Language Model Training, Variance-reduced Optimization, Vision Model Optimization

MarkLLM: Open-Source Toolkit for Detectable, Invisible Watermarks in LLM-Generated Text

As large language models (LLMs) become deeply embedded in enterprise workflows, content platforms, and research pipelines, the ability to verify…

01/13/2026 | AI-generated Text Detection, Content Provenance Verification, LLM Watermarking

Uni-MoE: Build One Unified Multimodal AI Instead of Five Separate Models

Imagine managing a project that needs to understand speech, analyze images, interpret video frames, and respond to written prompts—all within…

01/13/2026 | Instruction Tuning, Mixture-of-Experts, Multimodal Learning

OpenEMMA: Open-Source End-to-End Autonomous Driving with Multimodal Reasoning and Transparent Planning

Autonomous driving research has long been bottlenecked by the need for massive datasets, expensive compute infrastructure, and proprietary end-to-end frameworks.…

01/13/2026 | End-to-End Autonomous Driving, Multimodal Reasoning, Vision-language Models

IDRNet: Boost Semantic Segmentation Accuracy with Smarter Context Modeling—No Heavy Priors Required

If you’re building computer vision systems that rely on pixel-perfect understanding—like autonomous driving, medical imaging analysis, or retail scene parsing—you’ve…

01/13/2026 | Context Modeling, Dense Prediction, Semantic Segmentation

CCF: Build Secure Multi-Party Applications with Confidentiality, Integrity, and High Availability—Even on Untrusted Cloud Infrastructure

In today’s cloud-first world, organizations increasingly need to collaborate across trust boundaries—whether in finance, healthcare, supply chains, or regulatory compliance.…

01/13/2026 | Confidential Computing, Secure Multi-party Computation, Trusted Execution Environments

Arc2Face: Generate Identity-Consistent Faces with Precise Expression Control for AI Storytelling and Avatars

Creating realistic, diverse human faces that remain visually consistent with a specific identity—while allowing fine-grained control over expressions—is a persistent…

01/13/2026 | Expression Control, Face Generation, Identity-consistent Synthesis