Lumina-Image 2.0 is a state-of-the-art open-source text-to-image (T2I) generation framework that delivers exceptional visual fidelity and prompt adherence while maintaining…
Video-R1: Boost Video Reasoning in MLLMs with Efficient RL—Outperforming GPT-4o on Spatial Tasks
Video understanding has long been a bottleneck for multimodal large language models (MLLMs). While models can recognize objects or scenes…
PharMolixFM: High-Accuracy, All-Atom Molecular Modeling for Real-World Drug Discovery and Structural Biology
PharMolixFM is an all-atom foundation model purpose-built for molecular modeling and generation, jointly developed by PharMolix Inc. and the Institute…
ActionStudio: Unify, Train, and Deploy Large Action Models 9x Faster for Autonomous Agents
As autonomous AI agents become central to real-world applications—from customer service bots to robotic process automation—the demand for Large Action…
Text-to-LoRA: Instantly Customize LLMs with Plain English—No Training or Datasets Required
Large language models (LLMs) are powerful, but adapting them to specific tasks often demands significant effort: collecting labeled data, tuning…
MemoryOS: Give Your AI Agent Long-Term Memory and Personalized Context with an OS-Inspired Architecture
Most AI agents powered by Large Language Models (LLMs) struggle with a fundamental limitation: their fixed context windows. Once a…
DeepEyes: Enable Vision-Language Models to “Think with Images” and Solve Complex Visual Reasoning Tasks
Most modern Vision-Language Models (VLMs) treat images as static inputs—processed once, then reasoned about using purely text-based logic. But humans…
OpenGait: High-Accuracy, Open-Source Gait Recognition That Filters Out Clothing, Backgrounds, and Noise
If you’re evaluating biometric identification systems that work at a distance—without requiring cooperation, contact, or even clear facial visibility—gait recognition…
LLMC+: Plug-and-Play Compression for Vision-Language and Large Language Models Without Retraining
Deploying large vision-language models (VLMs) and large language models (LLMs) in real-world applications is often bottlenecked by their massive size,…
app.build: Generate Production-Ready, Validated Full-Stack Apps from a Single Prompt
Imagine turning a simple idea—like “a task manager with user authentication, real-time updates, and a clean UI”—into a fully working,…