PaperCodex
DyVal: Dynamic, Contamination-Free Evaluation of LLM Reasoning Capabilities


Evaluating large language models (LLMs) has become increasingly challenging. Traditional benchmarks—like MMLU, GSM8K, or Big-Bench Hard—are static, fixed in complexity,…

12/19/2025 · Dynamic Benchmarking, LLM Robustness Testing, Reasoning Evaluation
Caption Anything: Interactive, Multimodal Image Captioning Controlled by You


Traditional image captioning systems produce static, one-size-fits-all descriptions—often generic, inflexible, and disconnected from actual user intent. What if you could…

12/19/2025 · Image Captioning, Multimodal Control, Vision-Language Modeling
OmniParser V2: One Unified Model for Text Spotting, Table Recognition, and Document Understanding


In today’s data-driven world, businesses and researchers routinely process documents—scanned invoices, forms, tables, and receipts—to extract structured information. Traditionally, this…

12/19/2025 · Document Understanding, Multimodal Document Processing, Visual Text Parsing
ManimML: Animate Machine Learning Architectures Directly from Code—No Design Skills Needed


As machine learning models grow increasingly complex—from deep convolutional networks to attention-based architectures—the ability to clearly communicate how they work…

12/19/2025 · Educational Animation, Model Explanation, Neural Network Visualization
Code-Optimise: Boost Code Correctness and Runtime Efficiency Without Trade-offs


Modern code language models (CLMs) excel at generating functionally correct programs—but often at the cost of runtime efficiency. Conversely, efforts…

12/19/2025 · Code Generation, Model Optimization, Preference-Based Learning
FederatedScope-LLM: Collaboratively Fine-Tune Large Language Models Without Sharing Private Data


In today’s data-sensitive world, organizations increasingly want to harness the power of large language models (LLMs) while complying with strict…

12/19/2025 · Federated Learning, Parameter-Efficient Fine-Tuning, Privacy-Preserving NLP
HippoRAG: Neurobiologically Inspired Long-Term Memory for LLMs That Solves Multi-Hop Reasoning and Continual Knowledge Integration


Retrieval-Augmented Generation (RAG) has become a go-to architecture for grounding large language models (LLMs) in external knowledge. Yet, even the…

12/19/2025 · Continual Knowledge Integration, Multi-Hop Question Answering, Retrieval-Augmented Generation
DiffBIR: Unified Blind Image Restoration with Realistic Detail Recovery Across Super-Resolution, Face Enhancement, and Denoising


Blind image restoration—recovering high-quality images from degraded inputs without knowing the exact type or severity of degradation—is a longstanding challenge…

12/19/2025 · Blind Image Restoration, Face Restoration, Image Super-Resolution
Bi’an: Detect RAG Hallucinations Accurately with a Bilingual Benchmark and Lightweight Judge Models


Retrieval-Augmented Generation (RAG) has become a go-to strategy for grounding large language model (LLM) responses in real-world knowledge. By pulling…

12/19/2025 · Factuality Evaluation, Hallucination Detection, Retrieval-Augmented Generation
MiniCPM-V 4.5: GPT-4o-Level Vision Intelligence in an 8B Open-Source Model for Real-World Multimodal Tasks


Multimodal Large Language Models (MLLMs) promise to transform how machines understand images, videos, and text—but most top-performing models come with…

12/19/2025 · Efficient MLLM Deployment, Multimodal Reasoning, Vision-Language Understanding

Copyright © 2026 PaperCodex.