PaperCodex

RFBNet: High-Accuracy, Real-Time Object Detection Without Heavy Backbones

When building real-world computer vision systems—whether for autonomous drones, industrial inspection, or mobile apps—one of the toughest trade-offs is between…

12/27/2025 | Edge AI, Object Detection, Real-Time Inference
3DDFA_V2: Real-Time, CPU-Efficient 3D Face Alignment for Video and Edge Applications

If you’re building applications that require real-time 3D facial understanding—like video conferencing enhancements, augmented reality filters, biometric verification, or character…

12/27/2025 | 3D Face Alignment, Dense Facial Landmark Estimation, Real-time Face Tracking
Bunny: High-Performance Multimodal AI Without the Heavy Compute Burden

Multimodal Large Language Models (MLLMs) are transforming how machines understand and reason about visual content. Yet, their adoption remains out…

12/27/2025 | Efficient Inference, Multimodal Reasoning, Vision-Language Modeling
Step-Video-T2V: Generate High-Quality, Long-Form Videos from Text in English and Chinese

Step-Video-T2V is a state-of-the-art open-source text-to-video foundation model developed by StepFun AI. With 30 billion parameters and the ability to…

12/27/2025 | Multimodal Foundation Models, Text-to-Video Generation, Video Diffusion Models
GCNet: Boost Vision Models with Lightweight Global Context for Better Accuracy and Efficiency

If you’ve worked on computer vision tasks like object detection or instance segmentation, you’ve likely encountered the challenge of modeling…

12/27/2025 | Global Context Modeling, Instance Segmentation, Object Detection
GCOPTER: Real-Time, High-Fidelity Multicopter Trajectory Planning with Geometric and Dynamic Constraints

Autonomous multicopters—whether used in drone racing, delivery, inspection, or swarm coordination—face a persistent challenge: generating trajectories that are simultaneously smooth,…

12/27/2025 | Aerial Robotics, Motion Planning, Trajectory Optimization
LightningDiT: Break the Reconstruction-Generation Trade-Off with 21.8x Faster, SOTA Image Diffusion

Latent diffusion models (LDMs) have become a cornerstone of modern high-fidelity image generation. However, a persistent challenge has limited their…

12/27/2025 | Diffusion Transformers, Image Generation, Latent Diffusion Models
PRIME: Boost LLM Reasoning with Token-Level Rewards—No Step-by-Step Labels Needed

If you’re working to improve large language models (LLMs) on hard reasoning tasks—like math problem solving or competitive programming—you’ve likely…

12/27/2025 | Code Generation, Mathematical Reasoning, Reinforcement Learning
GANformer: Compositional, Controllable Image Generation with Fewer Training Steps

Traditional generative adversarial networks (GANs) often act as “black boxes”—they produce compelling images but offer little insight into how those…

12/27/2025 | Compositional Scene Modeling, Image Generation, Layout-to-Image Synthesis
FlagEmbedding: High-Performance, Task-Aware Text Embeddings for Multilingual RAG and Semantic Search

Modern AI applications—from customer support chatbots to enterprise knowledge retrieval—rely heavily on high-quality text embeddings to power semantic search and…

12/27/2025 | Retrieval-Augmented Generation, Semantic Search, Text Embedding

Copyright © 2026 PaperCodex.