PaperCodex

Optical Character Recognition (OCR)

Mini-Monkey: Fixing Fragmented Vision in Lightweight Multimodal Models with Smart Multi-Scale Cropping

When it comes to deploying multimodal large language models (MLLMs) in real-world applications—especially on cost-sensitive or edge devices—lightweight models are…

12/22/2025 | Document Understanding, Multimodal Reasoning, Optical Character Recognition (OCR)

PP-FormulaNet: High-Accuracy and High-Speed Math Formula Recognition for Document Intelligence

In the world of scientific publishing, academic research, and educational technology, one persistent bottleneck remains: converting handwritten or printed mathematical…

12/13/2025 | Document Intelligence, Formula Recognition, Optical Character Recognition (OCR)

MinerU: High-Precision Open-Source Document Parsing for Real-World PDFs, Tables, and Formulas

Converting real-world documents—especially PDFs containing mixed content like equations, tables, multi-column layouts, and scanned text—into clean, structured, machine-readable formats remains…

12/12/2025 | Document Parsing, Multimodal Understanding, Optical Character Recognition (OCR)

MonkeyOCR: High-Accuracy Document Parsing for Complex Layouts with Tables, Formulas, and Multilingual Text—Fast, Lightweight, and Deployable

Parsing complex documents—especially those containing tables, mathematical formulas, mixed layouts, or multilingual content—remains a persistent challenge in real-world AI applications.…

12/11/2025 (updated 12/15/2025) | Document Parsing, Optical Character Recognition (OCR), vision-language modeling

Copyright © 2026 PaperCodex.