Multimodal AI models like OpenAI’s CLIP have transformed how developers build systems that understand both images and text. But there’s…
Zero-shot Image Classification
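For orientation, here is a minimal sketch of what zero-shot image classification with CLIP looks like in practice, using the Hugging Face transformers zero-shot-image-classification pipeline. The checkpoint name, the candidate labels, and the "cat.jpg" path are all illustrative placeholders, not taken from the articles below.

```python
from transformers import pipeline

# Load a CLIP checkpoint behind the zero-shot image classification pipeline.
# "openai/clip-vit-base-patch32" is one commonly used public checkpoint.
classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

# Classify an image against arbitrary text labels -- no task-specific training.
# "cat.jpg" is a placeholder path; any local image or URL works.
results = classifier(
    "cat.jpg",
    candidate_labels=["a photo of a cat", "a photo of a dog", "a photo of a car"],
)

# Each result pairs a label with its similarity-based score.
for r in results:
    print(f"{r['label']}: {r['score']:.3f}")
```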
MetaCLIP: Superior Vision-Language Models Through Transparent, High-Quality Data Curation
If you’ve worked with OpenAI’s CLIP, you know its power—but also its opacity. CLIP revolutionized zero-shot vision-language understanding, yet it…
Perception Encoder: One Vision Model to Rule Image, Video, and Language Tasks – Without Task-Specific Training
Perception Encoder (PE) redefines what’s possible with a single vision encoder. Unlike legacy approaches that demand different pretraining strategies for…