GLM-130B: A Truly Open, Bilingual 130B-Parameter Language Model That Runs on Consumer GPUs

SmoothQuant: Accurate 8-Bit LLM Inference Without Retraining – Slash Memory and Boost Speed

Magika: AI-Powered File Type Detection with 99% Accuracy and Millisecond Speed

Self-Instruct: Bootstrap High-Quality Instruction Data Without Human Annotations

TTRL: Boost LLM Reasoning Without Labels Using Test-Time Reinforcement Learning
