Modern large language model (LLM) applications increasingly rely on structured outputs—think JSON responses for APIs, XML configuration files, or tool-call…
Lumina-mGPT 2.0: A Standalone Autoregressive Image Generator That Unifies Multimodal Tasks Without Diffusion Dependencies 1076
In the ever-evolving landscape of generative AI, image synthesis has long been dominated by diffusion models—powerful, yet often complex, resource-intensive,…
USO: Unified Image Generation that Preserves Subjects and Applies Styles in One Framework 1194
Generative AI has made remarkable strides in image synthesis, yet many tools force users to choose between style-driven and subject-driven…
GLM-4.5: Open-Source MoE LLM for High-Performance Agentic Reasoning and Coding 3288
GLM-4.5 is an open-source, high-performance Mixture-of-Experts (MoE) large language model engineered specifically for intelligent agents that need to reason, code,…
TorchAO: Unified PyTorch-Native Optimization for Faster Training and Efficient LLM Inference 2559
Deploying large AI models in production often involves a fragmented toolchain: one set of libraries for training, another for quantization,…
CodeGen: Open-Source LLMs That Generate Code from Natural Language—Smarter, Faster, and Free 5157
In today’s fast-paced software development landscape, the ability to translate natural language instructions into functional code is no longer science…
Attentive Reasoning Queries: Boost LLM Instruction-Following Accuracy in Business-Critical Applications 16725
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks—from answering questions to generating code. However,…
MagicTime: Generate Realistic Time-Lapse Videos That Simulate Real-World Physical Transformations 1342
Most text-to-video (T2V) models today excel at generating short clips of people walking, cars driving, or birds flying—but they struggle…
YOLOE: Real-Time Open-Vocabulary Object Detection and Segmentation Without Compromise 1939
Conventional object detectors like YOLOv8 are fast, reliable, and widely deployed—but they come with a critical limitation: they can only…
CogAgent: Automate Any GUI with Vision—No Code or HTML Needed 1104
Imagine giving a natural language instruction like “Mark all unread emails as read” or “Filter Amazon search results to show…