Financial analysis is rarely a solo endeavor. In real-world institutions—from investment banks to asset management firms—complex tasks like producing quarterly…
Retrieval-Augmented Generation
LMCache: Slash LLM Inference Latency and Multiply Throughput with Enterprise-Grade KV Cache Reuse 6375
Deploying large language models (LLMs) at scale introduces a familiar bottleneck: the growing size of Key-Value (KV) caches rapidly outpaces…
FlagEmbedding: High-Performance, Task-Aware Text Embeddings for Multilingual RAG and Semantic Search 10677
Modern AI applications—from customer support chatbots to enterprise knowledge retrieval—rely heavily on high-quality text embeddings to power semantic search and…
Search-R1: Train LLMs to Reason and Search Like Human Researchers Using Open-Source Reinforcement Learning 3614
In the rapidly evolving landscape of large language models (LLMs), a critical limitation persists: despite their impressive fluency, LLMs often…
HippoRAG: Neurobiologically Inspired Long-Term Memory for LLMs That Solves Multi-Hop Reasoning and Continual Knowledge Integration 3056
Retrieval-Augmented Generation (RAG) has become a go-to architecture for grounding large language models (LLMs) in external knowledge. Yet, even the…
Bi’an: Detect RAG Hallucinations Accurately with a Bilingual Benchmark and Lightweight Judge Models 8343
Retrieval-Augmented Generation (RAG) has become a go-to strategy for grounding large language model (LLM) responses in real-world knowledge. By pulling…
Search-o1: Boost Large Reasoning Models with On-Demand Knowledge Retrieval for Complex Problem Solving 1119
Large reasoning models (LRMs)—such as OpenAI’s o1—excel at multi-step logical reasoning, especially in science, math, and code-related tasks. But they…
VideoRAG: Unlock Long-Form Video Understanding with Retrieval-Augmented Generation for AI-Powered Insights 1356
Imagine being able to ask questions like “What did the professor say about quantum entanglement in Lecture 3?” or “Show…
WebThinker: Autonomous Web Research for Large Reasoning Models That Need Real-Time, Multi-Source Knowledge Synthesis 1366
In today’s fast-evolving information landscape, even the most advanced large reasoning models (LRMs)—such as OpenAI-o1 or DeepSeek-R1—are constrained by their…
FlashRAG: A Modular, Lightweight Toolkit for Reproducible and Efficient Retrieval-Augmented Generation Research 3208
Retrieval-Augmented Generation (RAG) has emerged as a cornerstone technique for enhancing the factual grounding, knowledge scope, and reasoning capabilities of…