
PaperCodex


Mixture-of-Experts

Megatron-LM: Train Billion-Parameter Transformer Models Efficiently on NVIDIA GPUs at Scale


If you’re building or scaling large language models (LLMs) and have access to NVIDIA GPU clusters, Megatron-LM—developed by NVIDIA—is one…

12/26/2025 · Distributed Deep Learning, Large Language Model Training, Mixture-of-Experts
GLM-4.5: Open-Source MoE LLM for High-Performance Agentic Reasoning and Coding


GLM-4.5 is an open-source, high-performance Mixture-of-Experts (MoE) large language model engineered specifically for intelligent agents that need to reason, code,…

12/19/2025 · Agentic Reasoning, Code Generation, Mixture-of-Experts