PaperCodex

Hallucination Detection

HaluEval: Detect and Benchmark LLM Hallucinations Across QA, Dialogue, and Summarization

Large language models (LLMs) like ChatGPT are transforming how we interact with AI—but they often “make things up.” These fabricated,…

01/13/2026 · Hallucination Detection, Knowledge-grounded Dialogue, Question Answering

FacTool: Automatically Detect Factual Errors in LLM Outputs Across Code, Math, QA, and Scientific Writing

Large language models (LLMs) like ChatGPT and GPT-4 have transformed how we generate text, write code, solve math problems, and…

01/13/2026 · Factuality Detection, Hallucination Detection, LLM Evaluation

Bi’an: Detect RAG Hallucinations Accurately with a Bilingual Benchmark and Lightweight Judge Models

Retrieval-Augmented Generation (RAG) has become a go-to strategy for grounding large language model (LLM) responses in real-world knowledge. By pulling…

12/19/2025 · Factuality Evaluation, Hallucination Detection, Retrieval-Augmented Generation

UQLM: Detect LLM Hallucinations with Uncertainty Quantification—Confidence Scoring Made Practical

Large Language Models (LLMs) are transforming how we build intelligent applications—from customer service bots to clinical decision support tools. Yet…

12/18/2025 · Hallucination Detection, LLM Reliability, Uncertainty Quantification

Copyright © 2026 PaperCodex.