Finance LLM for complex reasoning, driven by reinforcement learning
Fin-R1 is a 7B parameter large language model specifically designed for complex financial reasoning tasks. Developed by SUFE-AIFLM-Lab and Caiyue Xingchen, it aims to provide advanced financial analysis, code generation, risk management, and compliance capabilities for financial professionals. The model achieves state-of-the-art performance on several financial benchmarks, outperforming similarly sized models and even larger distilled models in specific areas.
How It Works
Fin-R1 is built upon the Qwen2.5-7B-Instruct base model and enhanced through a two-stage fine-tuning process. First, the model undergoes supervised fine-tuning (SFT) on a custom 60k-entry financial reasoning dataset (Fin-R1-Data). This dataset was curated with a dual-stage screening method: rule-based matching plus Qwen2.5-72B-Instruct checks for answer accuracy, followed by deep validation of reasoning chains for logical consistency and terminology compliance. Second, the model undergoes reinforcement learning (RL) with the GRPO algorithm, using format and accuracy rewards and a model-based verifier (Qwen2.5-Max) to refine output quality and generalization.
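The format and accuracy rewards can be sketched as simple rule-based scoring functions. This is a minimal illustration, assuming a `<think>...</think><answer>...</answer>` output convention and unit reward values; the exact tags, weights, and the model-based verifier used by the authors may differ.

```python
import re

# Completions are assumed to follow <think>reasoning</think><answer>result</answer>.
FORMAT_RE = re.compile(r"^<think>.*?</think>\s*<answer>.*?</answer>$", re.DOTALL)

def format_reward(completion: str) -> float:
    """1.0 if the completion matches the expected tag layout, else 0.0."""
    return 1.0 if FORMAT_RE.match(completion.strip()) else 0.0

def accuracy_reward(completion: str, gold: str) -> float:
    """1.0 if the extracted answer matches the reference.

    Exact string match here; Fin-R1 additionally routes non-trivial
    comparisons through a model-based verifier (Qwen2.5-Max)."""
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if m is None:
        return 0.0
    return 1.0 if m.group(1).strip() == gold.strip() else 0.0

def total_reward(completion: str, gold: str) -> float:
    """Combined rule-based reward used to score GRPO rollouts."""
    return format_reward(completion) + accuracy_reward(completion, gold)
```

In GRPO, several completions are sampled per prompt and scored this way; the advantage of each completion is its reward relative to the group mean, which avoids training a separate value model.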
Quick Start & Requirements
pip install vllm
git clone https://huggingface.co/SUFE-AIFLM-Lab/Fin-R1
vllm serve "/path/Fin-R1" --host 0.0.0.0 --port 8000 --gpu-memory-utilization 0.9 --max-model-len 16384 --tensor-parallel-size 2 --served-model-name "Fin-R1"
Here --tensor-parallel-size 2 shards the model across two GPUs, and --gpu-memory-utilization 0.9 lets vLLM use up to 90% of GPU memory; adjust both to match your hardware.
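Once the server is running, it exposes an OpenAI-compatible API at /v1. A minimal client sketch using only the standard library (the URL, sampling parameters, and example question are illustrative assumptions):

```python
import json
import urllib.request

def build_chat_request(question: str) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions payload for Fin-R1."""
    return {
        "model": "Fin-R1",  # must match --served-model-name above
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.7,
        "max_tokens": 4096,
    }

def ask(question: str, base_url: str = "http://localhost:8000/v1") -> str:
    """POST the request to the running vLLM server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(question)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The official `openai` Python client works the same way if you point its `base_url` at the server and pass any placeholder API key.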
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The model's outputs are for reference only and should not replace professional financial advice. Users are encouraged to apply critical thinking and their own expertise when using the model's suggestions.