Imagine a single AI model that doesn’t just “see” or “read”—but seamlessly blends images and text in both input and…
XuanCe: A Unified Deep Reinforcement Learning Library for Reliable, Cross-Framework AI Development 1008
Deep reinforcement learning (DRL) holds immense promise—from robotic control and autonomous systems to multi-agent coordination and game AI. Yet for…
AI-Trader: Benchmark Autonomous LLM Agents in Real Financial Markets with Zero Human Intervention 10216
Evaluating whether large language models (LLMs) can truly function as autonomous decision-makers in dynamic, real-world environments remains a fundamental challenge…
OmDet: Real-Time Open-Vocabulary Object Detection with Transformer Speed and Zero-Shot Accuracy 1360
OmDet is a breakthrough in open-vocabulary object detection (OVD)—a vision-language paradigm that enables models to recognize not just pre-defined object…
LeVo: Generate Full-Length, High-Fidelity Songs with Perfect Vocal-Instrument Harmony—Even on Consumer GPUs 1005
LeVo is a breakthrough in open-source AI music generation. Unlike many existing tools that produce fragmented, low-quality, or inconsistent audio,…
LMCache: Slash LLM Inference Latency and Multiply Throughput with Enterprise-Grade KV Cache Reuse 6375
Deploying large language models (LLMs) at scale introduces a familiar bottleneck: the growing size of Key-Value (KV) caches rapidly outpaces…
PyThaiNLP: The Essential Python Library for Accurate and Efficient Thai Language Processing 1092
Processing Thai text presents unique challenges for developers and data scientists. Unlike English and many other languages, Thai is written…
AnomalyGPT: Industrial Anomaly Detection Without Manual Thresholds or Labeled Anomalies 1043
In industrial quality control, detecting defects—like cracks in concrete, scratches on metal, or deformities in packaged goods—is critical. Yet traditional…
EasyPhoto: Generate Realistic, Identity-Preserving AI Portraits from Just 5–20 Photos 5188
In today’s fast-paced digital world, creating high-quality, personalized photos—whether for professional headshots, marketing campaigns, or custom avatars—often requires photography sessions,…
S-LoRA: Serve Thousands of Task-Specific LLMs Efficiently on a Single GPU 1879
Deploying dozens—or even thousands—of fine-tuned large language models (LLMs) has traditionally been a costly and complex endeavor. Each adapter typically…