PaperCodex

Emu3.5: A Native Multimodal World Model for Unified Vision-Language Generation and Reasoning

Imagine a single AI model that doesn’t just “see” or “read”—but seamlessly blends images and text in both input and…

01/04/2026 | Multimodal Generation, vision-language modeling, World Modeling
XuanCe: A Unified Deep Reinforcement Learning Library for Reliable, Cross-Framework AI Development

Deep reinforcement learning (DRL) holds immense promise—from robotic control and autonomous systems to multi-agent coordination and game AI. Yet for…

01/04/2026 | Deep Reinforcement Learning, Model-Based Reinforcement Learning, Multi-Agent Reinforcement Learning
AI-Trader: Benchmark Autonomous LLM Agents in Real Financial Markets with Zero Human Intervention

Evaluating whether large language models (LLMs) can truly function as autonomous decision-makers in dynamic, real-world environments remains a fundamental challenge…

01/04/2026 | Autonomous Agent Evaluation, Financial Decision-Making, Real-Time LLM Benchmarking
OmDet: Real-Time Open-Vocabulary Object Detection with Transformer Speed and Zero-Shot Accuracy

OmDet is a breakthrough in open-vocabulary object detection (OVD)—a vision-language paradigm that enables models to recognize not just pre-defined object…

01/04/2026 | Open-vocabulary Object Detection, Real-time Object Detection, Zero-shot Object Detection
LeVo: Generate Full-Length, High-Fidelity Songs with Perfect Vocal-Instrument Harmony—Even on Consumer GPUs

LeVo is a breakthrough in open-source AI music generation. Unlike many existing tools that produce fragmented, low-quality, or inconsistent audio,…

01/04/2026 | AI Music Generation, Multimodal Sequence Modeling, Text-to-music Synthesis
LMCache: Slash LLM Inference Latency and Multiply Throughput with Enterprise-Grade KV Cache Reuse

Deploying large language models (LLMs) at scale introduces a familiar bottleneck: the growing size of Key-Value (KV) caches rapidly outpaces…

01/04/2026 | KV Cache Reuse, LLM Inference Optimization, Retrieval-Augmented Generation
PyThaiNLP: The Essential Python Library for Accurate and Efficient Thai Language Processing

Processing Thai text presents unique challenges for developers and data scientists. Unlike English and many other languages, Thai is written…

01/04/2026 | Part-of-Speech Tagging, Text Normalization, Tokenization
AnomalyGPT: Industrial Anomaly Detection Without Manual Thresholds or Labeled Anomalies

In industrial quality control, detecting defects—like cracks in concrete, scratches on metal, or deformities in packaged goods—is critical. Yet traditional…

01/04/2026 | Few-shot Learning, Industrial Anomaly Detection, vision-language modeling
EasyPhoto: Generate Realistic, Identity-Preserving AI Portraits from Just 5–20 Photos

In today’s fast-paced digital world, creating high-quality, personalized photos—whether for professional headshots, marketing campaigns, or custom avatars—often requires photography sessions,…

01/04/2026 | AI Portrait Generation, Identity-Preserving Image Synthesis, LoRA-Based Personalization
S-LoRA: Serve Thousands of Task-Specific LLMs Efficiently on a Single GPU

Deploying dozens—or even thousands—of fine-tuned large language models (LLMs) has traditionally been a costly and complex endeavor. Each adapter typically…

01/04/2026 | Large Language Model Deployment, Multi-adapter Serving, Parameter-Efficient Fine-Tuning

Copyright © 2026 PaperCodex.