Awesome Vision-Language-Action Modeling Papers and Source Code

SimpleVLA-RL: Boost Robotic Task Performance with Minimal Data Using Reinforcement Learning 762

Building capable robotic systems that understand vision, language, and action—commonly referred to as Vision-Language-Action (VLA) models—has become a central goal…

01/05/2026 | Reinforcement Learning, Robotic Manipulation, Vision-Language-Action Modeling

SmolVLA: High-Performance Vision-Language-Action Robotics on a Single GPU 20075

SmolVLA is a compact yet capable Vision-Language-Action (VLA) model designed to bring state-of-the-art robot control within reach of researchers, educators,…

12/18/2025 | Imitation Learning, Robotic Manipulation, Vision-Language-Action Modeling

ShowUI: Open-Source Vision-Language-Action Model for Human-Like GUI Automation from Screenshots 1509

In today’s digital workflows, automating interactions with graphical user interfaces (GUIs)—whether on websites, mobile apps, or desktop software—is a high-value…

12/17/2025 | GUI Automation, Vision-Language-Action Modeling, Zero-Shot UI Grounding