PaperCodex

GCNet: Boost Vision Models with Lightweight Global Context for Better Accuracy and Efficiency

If you’ve worked on computer vision tasks like object detection or instance segmentation, you’ve likely encountered the challenge of modeling…

12/27/2025 | Global Context Modeling, Instance Segmentation, Object Detection
GCOPTER: Real-Time, High-Fidelity Multicopter Trajectory Planning with Geometric and Dynamic Constraints

Autonomous multicopters—whether used in drone racing, delivery, inspection, or swarm coordination—face a persistent challenge: generating trajectories that are simultaneously smooth,…

12/27/2025 | Aerial Robotics, Motion Planning, Trajectory Optimization
LightningDiT: Break the Reconstruction-Generation Trade-Off with 21.8x Faster, SOTA Image Diffusion

Latent diffusion models (LDMs) have become a cornerstone of modern high-fidelity image generation. However, a persistent challenge has limited their…

12/27/2025 | Diffusion Transformers, Image Generation, Latent Diffusion Models
PRIME: Boost LLM Reasoning with Token-Level Rewards—No Step-by-Step Labels Needed

If you’re working to improve large language models (LLMs) on hard reasoning tasks—like math problem solving or competitive programming—you’ve likely…

12/27/2025 | Code Generation, Mathematical Reasoning, Reinforcement Learning
GANformer: Compositional, Controllable Image Generation with Fewer Training Steps

Traditional generative adversarial networks (GANs) often act as “black boxes”—they produce compelling images but offer little insight into how those…

12/27/2025 | Compositional Scene Modeling, Image Generation, Layout-to-image Synthesis
FlagEmbedding: High-Performance, Task-Aware Text Embeddings for Multilingual RAG and Semantic Search

Modern AI applications—from customer support chatbots to enterprise knowledge retrieval—rely heavily on high-quality text embeddings to power semantic search and…

12/27/2025 | Retrieval-Augmented Generation, Semantic Search, Text Embedding
MacNet: Scale Multi-Agent LLM Collaboration Beyond Linear Workflows with Custom Topologies

Traditional multi-agent systems powered by large language models (LLMs) often follow rigid, sequential workflows—like a single assembly line where each…

12/27/2025 | Multi-agent Collaboration, Software Automation, Structured Creative Generation
Tree of Thoughts: Unlock Strategic Reasoning in LLMs for Complex Problem Solving

Large language models (LLMs) have transformed how we approach tasks ranging from coding assistance to content generation. Yet, their standard…

12/27/2025 | Planning, Reasoning, Search
RPG-DiffusionMaster: Generate Complex, Compositional Images from Text—No Retraining Needed

Text-to-image generation has made remarkable strides, yet even state-of-the-art models like DALL·E 3 or Stable Diffusion XL (SDXL) often stumble…

12/27/2025 | Compositional Image Synthesis, Multimodal Reasoning, Text-to-Image Generation
InternVideo: Build Powerful Video-Language AI Without Massive Compute or Data

Building capable video-language AI systems has long been a resource-intensive endeavor—requiring vast video datasets, weeks of training on dozens of…

12/27/2025 | Video Question Answering, Video-text Retrieval, Zero-shot Video Classification

Copyright © 2026 PaperCodex.