In many real-world scenarios—whether you’re analyzing patient outcomes in healthcare, consumer behavior in economics, or system failures in engineering—you can’t…
Kimi-Dev: Solve Real Software Bugs with a Test-Passing, Open-Source Coding LLM
Kimi-Dev is a state-of-the-art open-source large language model (LLM) purpose-built for software engineering tasks. Unlike generic coding assistants that generate…
Vizier: Production-Grade Black-Box Optimization for Reliable Hyperparameter Tuning and System Configuration
Optimizing complex systems—whether machine learning models, database configurations, or compiler flags—often feels like navigating a dark room: you know the…
AgentBench: Objectively Evaluate LLMs as Real-World Agents Across 8 Practical Environments
As large language models (LLMs) increasingly power autonomous agents—from customer service bots to system administration tools—a critical question arises: Can…
FlipVQA-Miner: Automatically Extract High-Quality Visual QA Pairs from Textbooks for Reliable LLM Training
Large Language Models (LLMs) and multimodal systems increasingly demand high-quality, human-authored supervision data—especially for tasks requiring reasoning, visual understanding, and…
Omnilingual ASR: Open-Source Speech Recognition for 1,600+ Languages—Including 500 Never Before Supported
For decades, automatic speech recognition (ASR) has flourished in high-resource languages like English, Spanish, or Mandarin. But for the vast…
PokeeResearch: Open-Source, High-Accuracy Deep Research Agent with Self-Verification and RL-Optimized Reasoning
In today’s fast-moving technical and research environments, teams need reliable, up-to-date answers to complex questions—without the black-box limitations or high…
MimicKit: Train Physics-Based Character Controllers with Motion Imitation and Reinforcement Learning
Imagine needing realistic, physics-compliant character movement for a game, simulation, or robotics project—but without the months of trial, error, and…
DocLayout-YOLO: Real-Time, High-Accuracy Document Layout Detection Without the Speed-Accuracy Trade-Off
Document layout analysis (DLA) is a foundational task in building real-world document understanding systems—whether you’re extracting structured data from invoices,…
TODS: Automated Outlier Detection for Multivariate Time Series – No ML Expertise Required
In modern data-driven operations—whether monitoring industrial sensors, analyzing financial transactions, or securing IT infrastructure—unexpected anomalies can signal critical failures, fraud,…