In today’s AI landscape, single-agent systems—powered by large language models (LLMs)—often hit a ceiling when tackling complex, multi-step problems. What…
Align Anything: The First Open Framework for Aligning Any-to-Any Multimodal Models with Human Intent
As AI systems grow more capable across diverse data types—text, images, audio, and video—the challenge of aligning them with human…
SPHINX-X: Build Scalable Multimodal AI Faster with Unified Training, Diverse Data, and Flexible Model Sizes
SPHINX-X is a next-generation family of Multimodal Large Language Models (MLLMs) designed to streamline the development, training, and deployment of…
Xorbits: Scale Pandas and NumPy Workflows to Clusters—With Just One Line of Code
Data scientists and machine learning engineers routinely rely on pandas and NumPy for data wrangling, exploration, and modeling. These libraries…
DyVal: Dynamic, Contamination-Free Evaluation of LLM Reasoning Capabilities
Evaluating large language models (LLMs) has become increasingly challenging. Traditional benchmarks—like MMLU, GSM8K, or Big-Bench Hard—are static, fixed in complexity,…
Caption Anything: Interactive, Multimodal Image Captioning Controlled by You
Traditional image captioning systems produce static, one-size-fits-all descriptions—often generic, inflexible, and disconnected from actual user intent. What if you could…
OmniParser V2: One Unified Model for Text Spotting, Table Recognition, and Document Understanding
In today’s data-driven world, businesses and researchers routinely process documents—scanned invoices, forms, tables, and receipts—to extract structured information. Traditionally, this…
ManimML: Animate Machine Learning Architectures Directly from Code—No Design Skills Needed
As machine learning models grow increasingly complex—from deep convolutional networks to attention-based architectures—the ability to clearly communicate how they work…
Code-Optimise: Boost Code Correctness and Runtime Efficiency Without Trade-offs
Modern code language models (CLMs) excel at generating functionally correct programs—but often at the cost of runtime efficiency. Conversely, efforts…
FederatedScope-LLM: Collaboratively Fine-Tune Large Language Models Without Sharing Private Data
In today’s data-sensitive world, organizations increasingly want to harness the power of large language models (LLMs) while complying with strict…