PaperCodex

OmniGen: One Unified Model for All Image Generation Tasks—No Plugins, No Preprocessing, Just Prompts

Modern image generation is powerful—but fragmented. Depending on your goal—generating from text, editing existing images, preserving a person’s identity, or…

12/19/2025 | Image Editing, Subject-driven Generation, Text-to-Image Generation

AniPortrait: Generate Photorealistic Talking-Head Videos from a Single Image and Audio Clip

Creating lifelike, animated human faces used to require complex pipelines—motion capture rigs, professional voice actors, or hours of post-production. But…

12/19/2025 | Audio-driven Animation, Face Reenactment, Portrait Animation

GaussianObject: High-Quality 3D Reconstruction from Just Four Images—No COLMAP Required

Creating photorealistic 3D models of real-world objects typically demands dozens—or even hundreds—of input images captured from carefully calibrated viewpoints. This…

12/19/2025 | 3D Object Reconstruction, Gaussian Splatting, Sparse-view Synthesis

AM-RADIO: Unify Vision Foundation Models into One High-Performance Backbone for Multimodal, Segmentation, and Detection Tasks

In modern computer vision, practitioners often juggle multiple foundation models—CLIP for vision-language alignment, DINOv2 for dense feature extraction, and SAM…

12/19/2025 | Object Detection, Semantic Segmentation, Vision-language Understanding

Semantic Operators: Declarative, Fast, and Accurate AI-Powered Data Processing for Unstructured and Structured Data

Processing unstructured data—like free-form text, documents, or multimodal inputs—with large language models (LLMs) has become essential across industries, from biomedical…

12/19/2025 | LLM-powered Analytics, Semantic Data Processing, Unstructured Data Transformation

NeedleBench: Rigorously Evaluate LLM Retrieval and Reasoning in Long-Context Scenarios

Evaluating how well large language models (LLMs) retrieve critical facts and perform reasoning over long documents remains a major challenge…

12/19/2025 | Complex Reasoning, Long-context Retrieval, Synthetic Benchmarking

AgentVerse: Build Collaborative LLM Agent Teams for Real Tasks or Behavioral Simulation

In today’s AI landscape, single-agent systems—powered by large language models (LLMs)—often hit a ceiling when tackling complex, multi-step problems. What…

12/19/2025 | Behavioral Simulation, Multi-agent Collaboration, Task Automation

Align Anything: The First Open Framework for Aligning Any-to-Any Multimodal Models with Human Intent

As AI systems grow more capable across diverse data types—text, images, audio, and video—the challenge of aligning them with human…

12/19/2025 | Instruction Tuning, Multimodal Alignment, Reinforcement Learning From Human Feedback

SPHINX-X: Build Scalable Multimodal AI Faster with Unified Training, Diverse Data, and Flexible Model Sizes

SPHINX-X is a next-generation family of Multimodal Large Language Models (MLLMs) designed to streamline the development, training, and deployment of…

12/19/2025 | Document Intelligence, Multimodal Understanding, Vision-language Modeling

Xorbits: Scale Pandas and NumPy Workflows to Clusters—With Just One Line of Code

Data scientists and machine learning engineers routinely rely on pandas and NumPy for data wrangling, exploration, and modeling. These libraries…

12/19/2025 | Data Preprocessing, Distributed Computing, Scalable Machine Learning

Copyright © 2026 PaperCodex.