FinLLMs by adlnlp

Curated list of resources for Large Language Models in Finance (FinLLMs)

Created 2 years ago

382 stars

Top 74.3% on SourcePulse

Project Summary

This repository serves as a comprehensive resource hub for Large Language Models (LLMs) applied to the finance domain (FinLLMs). It targets researchers and practitioners interested in the evolution, techniques, benchmarks, and datasets within financial Natural Language Processing (NLP), offering a curated collection of papers, models, and tasks.

How It Works

The project tracks the progression of language models from general-domain LLMs (e.g., GPT, BERT) to specialized Financial Pre-trained Language Models (FinPLMs) and subsequently to FinLLMs. It categorizes techniques including continual pre-training, domain-specific pre-training from scratch, mixed-domain pre-training, and instruction fine-tuning with prompt engineering, highlighting key models like FinBERT, FLANG, BloombergGPT, FinMA, InvestLM, and FinGPT.

Quick Start & Requirements

This repository is a curated list of resources, so it has no installation or execution steps of its own. It links to external projects and datasets, each with its own requirements.

Highlighted Details

Comprehensive survey of FinLLMs, covering history, techniques, evaluation, and opportunities/challenges.
Detailed breakdown of benchmark tasks (Sentiment Analysis, Text Classification, Named Entity Recognition (NER), Question Answering (QA), Stock Movement Prediction (SMP), Summarization) and advanced tasks (Relation Extraction, Event Detection, etc.), each with associated datasets and papers.
Links to numerous financial NLP workshops and programs for further engagement.

Maintenance & Community

The project is based on a survey paper accepted by the journal Neural Computing and Applications (2025). The repository describes itself as actively updated, although its last commit was about a year ago. Links to external GitHub repositories and HuggingFace models are provided for specific FinLLM projects.

Licensing & Compatibility

The repository itself does not specify a license. However, it links to various open-source models and datasets, each with its own licensing terms. Users must consult the licenses of individual linked resources for compatibility and usage restrictions.

Limitations & Caveats

This repository is a curated list and does not provide executable code or pre-trained models directly. Users must navigate to external links for model access, code execution, and dataset downloads, each potentially having its own setup requirements and limitations.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

10 stars in the last 30 days