Creating lifelike digital avatars that speak naturally with accurate lip movements and rich emotional expression has long been a challenge…
SkyThought: Boost Code Generation Accuracy Without Retraining—Even Small Models Beat GPT-4o-mini 3358
SkyThought is an open-source framework built around S*—a breakthrough test-time scaling approach designed specifically to elevate code generation performance in…
AniTalker: Generate Lifelike, Expressive Talking Faces from a Single Image and Audio Clip 1588
Imagine turning a static portrait—like the Mona Lisa or a headshot from your LinkedIn profile—into a vivid, talking avatar that…
Colossal-Auto: Automate Large Model Training with Zero Expertise in Parallelization or Checkpointing 41290
Training large-scale AI models—whether language models like LLaMA or video generators like Open-Sora—has become increasingly common, yet remains bottlenecked by…
EasyVolcap: Streamline Neural Volumetric Video Research with a Unified, Real-Time PyTorch Framework 1508
Neural volumetric video—capturing and rendering dynamic 3D scenes that can be viewed from any angle and time—is no longer just…
AnyText: Generate and Edit Multilingual Text in AI Images with Pixel-Perfect Accuracy 4822
If you’ve ever tried using a standard AI image generator to create a poster, product mockup, or social media banner…
DreamCraft3D: Generate Photorealistic, View-Consistent 3D Assets from a Single Image 2989
Creating high-quality 3D assets has traditionally required expert modeling skills, extensive manual labor, or expensive capture setups—barriers that limit accessibility…
S1: Boost Reasoning Performance with Just 1,000 Examples and Smart Test-Time Scaling 6613
In the rapidly evolving landscape of large language models (LLMs), achieving strong reasoning capabilities often comes at the cost of…
SwinIR: State-of-the-Art Image Restoration with Fewer Parameters and Higher Quality 5230
Image quality degradation—whether from compression, noise, or low resolution—is a persistent challenge across industries ranging from medical imaging to consumer…
PP-PicoDet: Real-Time Object Detection with SOTA Accuracy on Mobile and Edge Devices 13974
In today’s era of intelligent edge computing, deploying high-performance computer vision models on resource-constrained devices like smartphones, embedded sensors, and…