FinGLM by MetaGLM

FinGLM: Open-source project for AI in finance

Created 2 years ago

2,167 stars

Top 20.3% on SourcePulse

Project Summary

FinGLM is an open-source, public-benefit project that aims to advance "AI + Finance" by building a persistent financial large model. It focuses on conversational AI for expert-level analysis of company annual reports, and is aimed at developers and researchers interested in financial data analysis and LLM applications.

How It Works

The project processes company annual reports in four steps: converting PDFs to TXT, splitting the extracted data into categories (basic info, financial data, comprehensive info), computing derived metrics (e.g., cost rates, growth rates), and storing the results in SQL, MongoDB, or Elasticsearch databases. Models are then fine-tuned using strategies such as P-Tuning v2 or LoRA. At question time, the pipeline takes the user's input, generates a prompt, queries the database, and synthesizes an answer, relying on LLMs for both information extraction and response generation.
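The calculation, storage, and prompt-generation steps can be sketched as below. This is a minimal illustration, not the project's actual code: the table schema, the function names (`growth_rate`, `cost_rate`, `build_db`, `revenue_growth`, `build_prompt`), the prompt wording, and the "ExampleCo" figures are all assumptions made for the example, and SQLite stands in for whichever SQL database a team might choose.

```python
import sqlite3


def growth_rate(current, previous):
    """Year-over-year growth rate; None if a value is missing or the base is zero."""
    if current is None or previous is None or previous == 0:
        return None
    return (current - previous) / abs(previous)


def cost_rate(cost, revenue):
    """Cost as a fraction of revenue; None if revenue is missing or zero."""
    if cost is None or not revenue:
        return None
    return cost / revenue


def build_db(rows):
    """Load (company, year, revenue, operating_cost) rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE financials ("
        "company TEXT, year INTEGER, revenue REAL, operating_cost REAL, "
        "PRIMARY KEY (company, year))"
    )
    conn.executemany("INSERT INTO financials VALUES (?, ?, ?, ?)", rows)
    return conn


def revenue_growth(conn, company, year):
    """Query the two relevant years and compute the revenue growth rate."""
    cur = conn.execute(
        "SELECT year, revenue FROM financials WHERE company = ? AND year IN (?, ?)",
        (company, year, year - 1),
    )
    by_year = dict(cur.fetchall())
    return growth_rate(by_year.get(year), by_year.get(year - 1))


def build_prompt(question, company, year, rate):
    """Combine the retrieved figure with the user's question for the LLM."""
    fact = "not available" if rate is None else f"{rate:.2%}"
    return (
        f"Known data: the revenue growth rate of {company} in {year} is {fact}.\n"
        f"Question: {question}\n"
        "Answer using only the data above."
    )


if __name__ == "__main__":
    # Made-up figures for a hypothetical company, for illustration only.
    db = build_db([("ExampleCo", 2020, 100.0, 60.0), ("ExampleCo", 2021, 125.0, 70.0)])
    rate = revenue_growth(db, "ExampleCo", 2021)
    print(build_prompt("How fast did revenue grow in 2021?", "ExampleCo", 2021, rate))
```

Computing the figure in code and passing it to the model as a known fact keeps arithmetic out of the LLM, which then only has to phrase the answer.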

Quick Start & Requirements

Dataset Download:
- PDFs (69GB): git clone http://www.modelscope.cn/datasets/modelscope/chatglm_llm_fintech_raw_dataset.git (requires Git LFS)
- TXTs: wget https://sail-moe.oss-cn-hangzhou.aliyuncs.com/open_data/hackathon_chatglm_fintech/alltxt.zip
- HTMLs: wget https://sail-moe.oss-cn-hangzhou.aliyuncs.com/open_data/hackathon_chatglm_fintech/allhtml.zip
ModelScope SDK: pip3 install "modelscope==1.7.2rc0"
Datasets Library: pip3 install datasets==2.13.0
Prerequisites: Python, Git LFS, ModelScope SDK, Datasets library.
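
Once `alltxt.zip` has been downloaded, the report texts can be read directly from the archive without unpacking it to disk. The sketch below uses only the Python standard library. The helper name `iter_report_texts` is hypothetical, and the archive's internal layout and the UTF-8 encoding are assumptions; undecodable bytes are replaced rather than raising an error.

```python
import zipfile


def iter_report_texts(zip_path, encoding="utf-8"):
    """Yield (member_name, text) for every .txt file inside the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directories and anything that is not a TXT file.
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            raw = archive.read(info)
            # Assumed encoding; bad bytes become U+FFFD instead of failing.
            yield info.filename, raw.decode(encoding, errors="replace")


if __name__ == "__main__":
    # Assumes alltxt.zip is in the current working directory.
    for name, text in iter_report_texts("alltxt.zip"):
        print(name, len(text))
        break  # show only the first report
```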

Highlighted Details

Provides 70GB+ of annual report data (2019-2021) and 10,000+ manually annotated Q&A pairs.
Includes code and models from multiple winning teams of the SMP 2023 ChatGLM Financial Large Model Challenge.
Offers comprehensive learning tutorials covering data preprocessing, database usage, GLM utilization, prompt engineering, and model fine-tuning.
A dedicated fund of ¥100,000 and compute resources are available for project development.

Maintenance & Community

The project is maintained by a collective of teams and individuals. New members apply to join, and the community stays in touch through regular communication channels. Learning materials and contribution resources are available to participants.

Licensing & Compatibility

The project itself appears to be open for research and non-commercial use. However, users must adhere to the specific licenses of underlying models, such as ChatGLM-6B, when using them for commercial purposes.

Limitations & Caveats

Because the project is primarily intended for research and non-commercial use, commercial deployments should first confirm the licensing terms of each component, particularly the base LLMs.

Explore Similar Projects

FinEval by SUFE-AIFLM-Lab

receipthero by Nutlope

introspect by defog-ai

FinQwen by Tongyi-EconML

GPT-InvestAR by UditGupta10

FinAnGPT by austin-starks

FinLLMs by adlnlp

financial-datasets by virattt

sec-parser by alphanome-ai

FinanceMCP by guangxiangdebizi

ai-financial-agent by virattt

awesome-ai-in-finance by georgezouq