For technical decision makers evaluating multimodal AI, choosing between closed-source APIs and open alternatives often means trading off control,…
Show-1: High-Quality, Efficient Text-to-Video Generation with Precise Prompt Alignment 1133
Text-to-video generation has rapidly evolved, yet technical teams still face a persistent trade-off: high-quality outputs often come at prohibitive computational…
TinyLlama: A Fast, Efficient 1.1B Open Language Model for Edge Deployment and Speculative Decoding 8770
TinyLlama is a compact yet powerful open-source language model with just 1.1 billion parameters—but trained on an impressive 3 trillion…
SLAM3R: Real-Time Dense 3D Reconstruction from Monocular Video—No Camera Calibration Needed 1045
Introducing SLAM3R—a cutting-edge, end-to-end system that reconstructs high-quality, dense 3D scenes in real time using only a monocular RGB video…
3DGUT: Real-Time 3D Reconstruction That Handles Distorted Cameras and Reflections Without Sacrificing Speed 1743
3D Gaussian Splatting (3DGS) revolutionized real-time 3D scene reconstruction by delivering photorealistic quality at high frame rates on consumer GPUs.…
LLaVA-CoT: Step-by-Step Visual Reasoning for Reliable, Explainable Multimodal AI 2108
Most vision-language models (VLMs) today can describe what’s in an image—but they often falter when asked to reason about it.…
TEQ: Accurate 3- and 4-Bit LLM Quantization Without Inference Overhead 2544
Deploying large language models (LLMs) in production often runs into a hard trade-off: reduce model size and latency through quantization,…
YOLOv9: Train-from-Scratch Object Detection That Beats Pretrained Models with Programmable Gradient Information 9391
YOLOv9 marks a significant leap forward in real-time object detection by directly confronting a long-standing but often overlooked problem in…
Mulberry: Step-by-Step Multimodal Reasoning with o1-Like Reflection for Trustworthy AI Decisions 1217
Traditional multimodal large language models (MLLMs) often produce answers without revealing how they got there—especially when dealing with complex questions…
ELF: Train Real-Time Strategy AI Bots 10x Faster with a Lightweight, Flexible RL Platform 2094
Reinforcement learning (RL) for real-time strategy (RTS) games has long been bottlenecked by slow simulation, rigid environment interfaces, and high…