DISC-FinLLM: A Specialized Chinese Financial LLM for Accurate, Context-Aware Financial Intelligence

DISC-FinLLM: A Specialized Chinese Financial LLM for Accurate, Context-Aware Financial Intelligence
Paper & Code
DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning
2023 FudanDISC/DISC-FinLLM
818

If you’re building AI-powered tools for the Chinese financial sector—whether for banking, fintech, investment research, or regulatory compliance—you’ve likely run into a persistent problem: general-purpose large language models (LLMs) often lack the precision, domain knowledge, and contextual awareness needed to handle real-world financial tasks in Chinese. Enter DISC-FinLLM, a purpose-built Chinese financial large language model developed by the Data Intelligence and Social Computing Lab (Fudan-DISC) at Fudan University.

Unlike generic LLMs fine-tuned on broad internet data, DISC-FinLLM is engineered specifically for the Chinese financial ecosystem. It integrates expert-level capabilities in financial consulting, text analysis, quantitative computation, and retrieval-augmented Q&A—all trained on a high-quality, 246k-sample instruction dataset called DISC-Fin-SFT. This allows technical teams to deploy a model that not only understands financial terminology and local regulations but can also perform complex calculations and reason over up-to-date financial documents.

For project and technical decision-makers seeking a reliable, ready-to-deploy foundation for Chinese financial AI applications, DISC-FinLLM offers a compelling alternative to building from scratch or forcing generic models to “guess” their way through nuanced financial scenarios.

Four Expert Modules, One Cohesive Financial Intelligence System

DISC-FinLLM’s architecture is built around a Multiple Experts Fine-tuning Framework, which trains four specialized LoRA (Low-Rank Adaptation) modules on distinct financial subtasks. Each module addresses specific pain points faced by developers and financial product teams:

Financial Consulting: Multi-Turn Dialogue with Domain Expertise

This module enables DISC-FinLLM to engage in natural, multi-turn conversations about Chinese financial topics—such as explaining “non-performing assets” or discussing stock market mechanisms. Trained on translated and regenerated financial QA pairs, glossary-based definitions, and forum-style dialogues, it delivers responses that align with local market practices, regulatory norms, and linguistic conventions.

Financial Text Analysis: NLP Tasks Tailored to Chinese Financial Documents

From sentiment analysis of earnings news to named entity recognition in regulatory filings, this module handles core NLP tasks across Chinese financial texts. It’s trained on a curated mix of open-source Chinese financial NLP datasets—including FPB (financial sentiment), CCKS-NEC (entity extraction), and industry reports—enabling accurate information extraction, classification, and summarization.

Financial Computation: Reliable Math with Tool Integration

Financial decisions often hinge on precise calculations. DISC-FinLLM’s computation module supports four integrated tools:

  • Expression calculator (e.g., ROI, growth rates)
  • Equation solver (for modeling scenarios)
  • Statistic tool (mean, variance, sample analysis)
  • Probability table lookup (e.g., standard normal distribution values)

The model learns to invoke these tools conditionally by generating structured commands like [Calculator(expression)-result], ensuring numerical accuracy even for complex formulas like Black-Scholes option pricing or EDF default probability models.

Retrieval-Augmented Financial Q&A: Contextual Answers from Real Documents

When users ask about policy changes, market trends, or investment advice, DISC-FinLLM can ground its responses in real financial sources. This module uses Chain-of-Retrieval (CoR) prompting to generate answers based on a knowledge base of 18k analyst reports and 69k financial news articles. It excels at tasks like industry analysis, policy interpretation, and strategic recommendations—without hallucinating unsupported claims.

Practical Use Cases for Technical Teams

DISC-FinLLM is designed for immediate integration into real-world systems. Here are several high-impact scenarios where it delivers value:

  • Financial Chatbots: Power customer-facing assistants that explain products, regulations, or market movements in accurate, compliant Chinese.
  • Automated Report Summarization: Extract key insights from earnings calls, regulatory filings, or news digests using the text analysis module.
  • Investment Research Assistants: Support analysts by computing financial ratios, interpreting valuation models, or retrieving relevant market commentary.
  • Internal Knowledge Bases: Enhance enterprise search tools with retrieval-augmented generation that answers domain-specific questions using internal or public financial documents.

Critically, all these applications are optimized for the Chinese financial context—including local terminology, regulatory frameworks (e.g., China Securities Regulatory Commission guidelines), and data sources like East Money reports. This localization is a key differentiator from Western-trained finance LLMs.

Flexible Deployment: Full Model or Lightweight LoRA Adapters

DISC-FinLLM offers two deployment modes to suit different resource and use-case requirements:

  • Full Fine-Tuned Model: A complete version of Baichuan-13B-Chat fine-tuned on the entire DISC-Fin-SFT dataset. Ideal for general-purpose financial AI applications.
  • LoRA Adapters: Four lightweight, task-specific adapters that can be loaded on-demand atop the base Baichuan-13B-Chat model. For example, you can load only the “computing” adapter when performing financial math, minimizing memory overhead.

Integration is straightforward:

  • Use the Python API for programmatic inference
  • Run the CLI demo for quick testing
  • Launch a Streamlit web demo for prototyping UIs

The model also supports standard Baichuan-13B tooling, including INT8/INT4 quantization for CPU or low-resource GPU deployment—an important consideration for enterprise environments with hardware constraints.

Performance Validated Across Financial Benchmarks

DISC-FinLLM isn’t just theoretically sound—it’s rigorously evaluated using the DISC-Fin-Eval Benchmark, which covers four dimensions:

  1. Financial NLP Tasks: Outperforms baselines like Baichuan-13B-Chat and ChatGLM on metrics like F1 and ROUGE across sentiment analysis, relation extraction, and summarization.
  2. Human-like Exam Questions: Achieves higher accuracy than most open-source LLMs on Chinese finance, economics, accounting, and auditing multiple-choice questions.
  3. Financial Computation: Scores 35% on both formula and result correctness in a 100-question dataset of real-world financial calculations—nearly double Baichuan-13B-Chat’s performance.
  4. Real-Time Analysis: Shows improved relevance, usefulness, and reasoning quality over the base model when answering time-sensitive financial queries using retrieved documents.

These results confirm that DISC-FinLLM delivers measurable improvements across the financial AI stack.

Key Limitations and Responsible Use

While powerful, DISC-FinLLM has important boundaries:

  • Chinese-language only: Not suitable for English or multilingual financial applications.
  • Not a financial advisor: Outputs should be treated as decision-support aids, not professional advice.
  • China-specific focus: May underperform on global finance topics, non-Chinese regulations, or cross-border investment scenarios.

The project maintainers explicitly state that DISC-FinLLM should not replace human experts and must be used with critical evaluation.

Summary

DISC-FinLLM fills a critical gap in the AI landscape: a high-performance, open-source LLM purpose-built for the Chinese financial domain. By combining expert fine-tuning, tool-augmented computation, and retrieval grounding, it enables technical teams to build accurate, context-aware financial applications without months of data collection and training. Whether you’re prototyping a fintech product or enhancing an internal analytics platform, DISC-FinLLM provides a robust, evaluated, and deployment-ready foundation.

All code, datasets, evaluation benchmark (DISC-Fin-Eval), and pre-trained models are publicly available on GitHub and Hugging Face, allowing immediate experimentation and integration.