Building AI agents that can handle diverse, real-world tasks—and improve over time without hand-holding—is one of the biggest challenges in…
Reinforcement Learning
RL4CO: Accelerate Reinforcement Learning for Combinatorial Optimization with a Unified, Reproducible Benchmark 757
Combinatorial optimization (CO) lies at the heart of countless real-world challenges—from vehicle routing and job scheduling to chip design and…
RLinf: Accelerate Large-Scale Reinforcement Learning for Agentic AI and Embodied Intelligence 503
Reinforcement learning (RL) is rapidly becoming the engine behind next-generation agentic AI—powering everything from math-reasoning language models to vision-guided robotic…
TTRL: Boost LLM Reasoning Without Labels Using Test-Time Reinforcement Learning 836
Imagine being able to improve the reasoning capabilities of a large language model (LLM) after deployment, using only unlabeled test data, with no ground-truth…
SimpleVLA-RL: Boost Robotic Task Performance with Minimal Data Using Reinforcement Learning 762
Building capable robotic systems that understand vision, language, and action—commonly referred to as Vision-Language-Action (VLA) models—has become a central goal…
MimicKit: Train Physics-Based Character Controllers with Motion Imitation and Reinforcement Learning 1196
Imagine needing realistic, physics-compliant character movement for a game, simulation, or robotics project—but without the months of trial, error, and…
PRIME: Boost LLM Reasoning with Token-Level Rewards—No Step-by-Step Labels Needed 1783
If you’re working to improve large language models (LLMs) on hard reasoning tasks—like math problem solving or competitive programming—you’ve likely…
ELF: Train Real-Time Strategy AI Bots 10x Faster with a Lightweight, Flexible RL Platform 2094
Reinforcement learning (RL) for real-time strategy (RTS) games has long been bottlenecked by slow simulation, rigid environment interfaces, and high…
Reasoning Gym: Train and Evaluate Reasoning Models with Infinite, Verifiable Reinforcement Learning Environments 1265
If you’re building or evaluating reasoning-capable AI systems—especially large language models (LLMs)—you’ve likely hit a wall with static benchmarks. Traditional…
Gymnasium: A Standardized, Reproducible Interface for Reinforcement Learning Environments 10396
Reinforcement learning (RL) holds immense promise for solving complex decision-making problems—from robotics and game playing to resource optimization and autonomous…