PaperCodex

On-device Inference

TinyLlama: A Fast, Efficient 1.1B Open Language Model for Edge Deployment and Speculative Decoding

TinyLlama is a compact yet powerful open-source language model with just 1.1 billion parameters—but trained on an impressive 3 trillion…

12/22/2025 · On-device Inference, Speculative Decoding, Text Generation
MNN: Run Large Language Models and Vision AI Offline on Mobile with a Lightweight, High-Performance Inference Engine

Mobile Neural Network (MNN) is an open-source, lightweight deep learning inference engine developed by Alibaba Group to bring powerful AI…

12/18/2025 · Large Language Model Deployment, Multimodal AI, On-device Inference
AgentCPM-GUI: On-Device AI Agent for Bilingual Mobile Automation with Reinforcement Fine-Tuning

AgentCPM-GUI is an open-source, on-device large language model (LLM) agent designed to understand smartphone screenshots and autonomously perform user-specified tasks…

12/18/2025 · GUI Agent, Mobile Automation, On-device Inference
BitNet: Run 1.58-Bit LLMs Locally on CPUs with 6x Speedup and 82% Less Energy

Running large language models (LLMs) used to require powerful GPUs, expensive cloud infrastructure, or specialized hardware—until BitNet changed the game…

12/12/2025 · Efficient LLM Deployment, On-device Inference, Text Generation
Copyright © 2026 PaperCodex.