PaperCodex | Page 23 of 53 | Find Awesome Papers and Source Codes

MacNet: Scale Multi-Agent LLM Collaboration Beyond Linear Workflows with Custom Topologies

Traditional multi-agent systems powered by large language models (LLMs) often follow rigid, sequential workflows—like a single assembly line where each…

12/27/2025 | Multi-agent Collaboration, Software Automation, Structured Creative Generation

Tree of Thoughts: Unlock Strategic Reasoning in LLMs for Complex Problem Solving

Large language models (LLMs) have transformed how we approach tasks ranging from coding assistance to content generation. Yet, their standard…

12/27/2025 | Planning, Reasoning, Search

RPG-DiffusionMaster: Generate Complex, Compositional Images from Text—No Retraining Needed

Text-to-image generation has made remarkable strides, yet even state-of-the-art models like DALL·E 3 or Stable Diffusion XL (SDXL) often stumble…

12/27/2025 | Compositional Image Synthesis, Multimodal Reasoning, Text-to-Image Generation

InternVideo: Build Powerful Video-Language AI Without Massive Compute or Data

Building capable video-language AI systems has long been a resource-intensive endeavor—requiring vast video datasets, weeks of training on dozens of…

12/27/2025 | Video Question Answering, Video-text Retrieval, Zero-shot Video Classification

LyCORIS: Customize Stable Diffusion Without Retraining the Whole Model – Flexible, Lightweight Fine-Tuning for Text-to-Image Generation

If you’re working with text-to-image models like Stable Diffusion, you’ve likely faced the trade-off between customization and efficiency. Full fine-tuning…

12/27/2025 | Model Customization, Parameter-Efficient Fine-Tuning, Text-to-Image Generation

EvalPlus: Rigorously Evaluate LLM-Generated Code with 80× More Test Cases and Realistic Performance Metrics

When large language models (LLMs) generate code, how do you know it’s actually correct? Traditional code evaluation benchmarks like HumanEval…

12/27/2025 | Code Efficiency Benchmarking, Code Generation Evaluation, Functional Correctness Testing

Personalize-SAM: One-Shot Personalized Segmentation Without Training for Photos, Videos, and Generative AI Workflows

Imagine you have a photo album filled with images of your dog—but you want to automatically isolate your pet in…

12/27/2025 | Image Segmentation, One-shot Learning, Video Object Segmentation

CRATE: Interpretable, Parameter-Efficient Vision Transformers for Structured Unsupervised Learning

In an era where deep learning models grow ever larger and more opaque, the demand for interpretable, efficient, and theoretically…

12/27/2025 | Computer Vision, Representation Learning, Self-supervised Learning

NeuralForecast: Accurate, Easy-to-Use Neural Time Series Forecasting for Real-World Applications

Time series forecasting remains a core challenge across industries—from retail and energy to finance and logistics. While deep learning has…

12/27/2025 | Multivariate Forecasting, Probabilistic Forecasting, Time-series Forecasting

Qwen3-Omni: One Unified Model for Text, Image, Audio, and Video—Without Compromise

Imagine a single AI model that natively understands and generates responses across text, images, audio, and video—all in real time,…

12/27/2025 | Audio Captioning, Multimodal Reasoning, Real-time Speech Synthesis