PaperCodex

Multimodal AI

Cube: Generate 3D Assets from Text Prompts—No Modeling Skills Required

Imagine describing a “mechanical lobster with tank treads” in plain English and instantly getting a usable 3D model—no Blender expertise,…

01/09/2026 | 3D Shape Modeling, Multimodal AI, Text-to-3D Generation
AudioGPT: Build Spoken AI Experiences with Speech, Music, Sound, and Talking Head Generation in One Unified System

AudioGPT is a multimodal AI system that bridges the gap between large language models (LLMs) like ChatGPT and the rich…

12/18/2025 | Audio Generation, Multimodal AI, Speech Synthesis
MNN: Run Large Language Models and Vision AI Offline on Mobile with a Lightweight, High-Performance Inference Engine

Mobile Neural Network (MNN) is an open-source, lightweight deep learning inference engine developed by Alibaba Group to bring powerful AI…

12/18/2025 | Large Language Model Deployment, Multimodal AI, On-device Inference
Copyright © 2026 PaperCodex.