Awesome Video Understanding Papers and Source Code

LiteFlowNet: High-Accuracy Optical Flow Estimation with a Lightweight, Fast CNN for Real-World Applications 623

Optical flow estimation—the task of predicting per-pixel motion between consecutive video frames—is foundational in computer vision applications ranging from autonomous…

01/13/2026 | Motion Analysis, Optical Flow Estimation, Video Understanding

VideoMamba: Efficient Long- and Short-Term Video Understanding Without the Compute Overhead 1044

Video understanding has long been bottlenecked by two competing demands: capturing fine-grained local motion while simultaneously modeling long-range temporal dependencies.…

12/26/2025 | Action Recognition, Video Understanding, Video-text Retrieval

Show-o: One Unified Transformer for Multimodal Understanding and Generation Across Text, Images, and Videos 1809

In today’s AI landscape, developers and researchers often juggle separate models for vision, language, and video—each with its own architecture,…

12/18/2025 | Image Generation, Multimodal Understanding, Video Understanding

Video-ChatGPT: Enable Accurate, Detailed Video Understanding with Multimodal Conversational AI 1444

Video-ChatGPT is a state-of-the-art multimodal AI system that bridges the gap between video content and human-like conversation. Built by researchers…

12/17/2025 | Multimodal Dialogue, Video Question Answering, Video Understanding