PaperCodex

YOLOv13: Boost Real-Time Object Detection Accuracy Without Sacrificing Speed or Efficiency

For engineers, researchers, and product teams building real-time vision systems—whether for surveillance cameras, autonomous drones, or mobile apps—achieving high detection…

01/05/2026 | Edge AI, Object Detection, Real-time Computer Vision
UniAnimate-DiT: High-Fidelity Human Animation from a Single Image and Pose Sequence – No Full Retraining Needed

Animating a static human image into a realistic, temporally coherent video used to require massive datasets, complex pipelines, or retraining…

01/05/2026 | Diffusion Transformer, Human Image Animation, Video Generation
360-LLaMA-Factory: Plug-and-Play Sequence Parallelism for Long-Context SFT and DPO Without Rewriting Your Workflow

Training large language models (LLMs) on long sequences—whether for document-level instruction tuning, multi-modal reasoning, or complex alignment tasks—has long been…

01/05/2026 | Direct Preference Optimization, Long-Context Training, Supervised Fine-tuning
DeepResearcher: Train AI Research Agents That Think, Verify, and Adapt in the Real Web Environment

In today’s AI landscape, many organizations rely on large language models (LLMs) to automate complex research tasks—such as competitive analysis,…

01/05/2026 | Autonomous Research Agents, Reinforcement Learning For Information Retrieval, Web-grounded Reasoning
LLM×MapReduce: Generate Coherent Long-Form Articles from Extremely Long Inputs Using LLMs Efficiently

If you’ve ever tried using a large language model (LLM) to synthesize a detailed technical report from hundreds of research…

01/05/2026 | Document Synthesis, Long-context Reasoning, Long-form Generation
Waver: Generate Lifelike, High-Motion Videos in 1080p with One Unified Model

In the rapidly evolving world of generative AI, video generation has remained a particularly challenging frontier—especially when it comes to…

01/05/2026 | Image-to-Video Synthesis, Multimodal Generative Modeling, Text-to-Video Generation
VGGT-Long: Scalable Monocular 3D Reconstruction for Kilometer-Scale Real-World Sequences Without Retraining or Calibration

Monocular 3D reconstruction has seen rapid advances thanks to foundation models capable of inferring rich geometric structure from single images.…

01/05/2026 | Large-scale SLAM, Monocular 3D Reconstruction, Vision Foundation Models
SimpleVLA-RL: Boost Robotic Task Performance with Minimal Data Using Reinforcement Learning

Building capable robotic systems that understand vision, language, and action—commonly referred to as Vision-Language-Action (VLA) models—has become a central goal…

01/05/2026 | Reinforcement Learning, Robotic Manipulation, Vision-Language-Action Modeling
PUSA: Generate High-Quality Video from Text or Images for $500—Not $100,000

Video generation has long been bottlenecked by two stubborn realities: astronomical training costs and rigid temporal modeling. Most state-of-the-art image-to-video…

01/05/2026 | Image-to-Video Synthesis, Multi-condition Video Diffusion, Text-to-Video Generation
Decoupled DMD: Unlock Ultra-Fast, High-Quality Image Generation with 8-Step Distillation

If you’re building or evaluating text-to-image systems that demand both speed and visual fidelity, Decoupled DMD offers a breakthrough in…

01/04/2026 | Diffusion Model Distillation, Few-step Image Synthesis, Text-to-Image Generation

Copyright © 2026 PaperCodex.