PaperCodex

FastSAM: Real-Time Image Segmentation at 50x Speed Without Sacrificing Accuracy

In today’s fast-paced computer vision landscape, high-quality image segmentation is no longer a luxury—it’s a necessity. Yet, despite the groundbreaking…

12/26/2025 | Image Segmentation, Instance Segmentation, Zero-shot Segmentation
Tortoise-TTS: High-Quality, Multi-Voice Text-to-Speech with Realistic Prosody and Open-Source Flexibility

Tortoise-TTS is an open-source text-to-speech (TTS) system designed for one core purpose: generating expressive, natural-sounding speech with strong multi-voice capabilities.…

12/26/2025 | Speech Synthesis, Text-to-Speech, Voice Cloning
InvSR: High-Quality Image Super-Resolution in 1–5 Steps Using Diffusion Inversion

Image super-resolution (SR) remains a critical capability across computer vision applications—from upscaling smartphone photos to enhancing AI-generated content (AIGC). However,…

12/26/2025 | AIGC Enhancement, Diffusion Models, Image Super-resolution
DeepSeek-V3: A High-Performance, Cost-Efficient MoE Language Model That Delivers Closed-Source Power with Open-Source Flexibility

For technical decision-makers evaluating large language models (LLMs) for real-world applications, balancing raw capability, inference cost, training efficiency, and deployment…

12/26/2025 | Code Generation, Mathematical Reasoning, Multilingual Language Modeling
LLaMA-Adapter: Efficiently Transform LLaMA into Instruction-Following or Multimodal AI with Just 1.2M Parameters

If you’re working on a project that requires a capable language model—but lack the GPU budget, time, or infrastructure for…

12/26/2025 | Instruction Tuning, Multimodal Learning, Parameter-Efficient Fine-Tuning
CoOp: Adapt Vision-Language Models Like CLIP to Your Task with Just a Few Labels—No Full Fine-Tuning Needed

Imagine you have access to a powerful pre-trained vision-language model like CLIP—capable of understanding both images and text—but you need…

12/26/2025 | Few-shot Image Classification, Prompt Learning, Vision-language Model Adaptation
In-Context LoRA: Generate High-Fidelity Multi-Image Sets with Minimal Data and No Model Changes

Imagine you need to generate a cohesive set of images—say, a film storyboard, a series of product design mockups, or…

12/26/2025 | Diffusion Transformers, In-context Learning, Multi-image Generation
BiRefNet: High-Resolution Binary Image Segmentation with Pixel-Perfect Detail and Cross-Task Generalization

BiRefNet (Bilateral Reference Network) is a state-of-the-art deep learning model designed specifically for high-resolution dichotomous image segmentation (DIS)—a task that…

12/26/2025 | Dichotomous Image Segmentation, Image Matting, Salient Object Detection
UltraChat: Train Powerful Open-Source Chat Models with 1.5M High-Quality, Privacy-Safe AI Dialogues

If you’re a technical decision-maker evaluating options for building or fine-tuning a conversational AI system, you know that high-quality instruction-following…

12/26/2025 | Conversational AI, Instruction Tuning, Multi-turn Dialogue Modeling
AI-Scientist: Automate End-to-End Machine Learning Research from Idea to Peer-Reviewed Paper

Imagine a system that doesn’t just assist scientists—but acts as one. It generates novel research hypotheses, writes executable code, runs…

12/26/2025 | Automated Scientific Discovery, End-to-end ML Research Automation, LLM-driven Experimentation

Copyright © 2026 PaperCodex.