FinBERT by yya518

BERT model for financial NLP tasks

created 5 years ago
613 stars

Top 54.5% on sourcepulse

Project Summary

FinBERT is a BERT language model pre-trained on a large corpus of financial communications, designed to advance financial Natural Language Processing (NLP) research and applications. It offers specialized models for sentiment analysis, ESG classification, and forward-looking statement (FLS) classification, outperforming traditional and other deep learning models on these tasks.

How It Works

FinBERT uses the BERT architecture and is pre-trained on 4.9 billion tokens drawn from corporate reports (10-K & 10-Q), earnings call transcripts, and analyst reports. Training on this financial corpus lets FinBERT capture domain-specific language that general-purpose models miss. The project also provides versions fine-tuned for specific NLP tasks, for which it reports state-of-the-art performance.
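To make the pre-training idea concrete, the following is a minimal sketch of the masked-token objective commonly used to pre-train BERT-style models. It is an illustration only: the 15% masking rate and the 80/10/10 replacement split are the standard BERT defaults, assumed here rather than taken from the FinBERT project, and `mask_tokens` and the toy vocabulary are hypothetical names.

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (inputs, labels) for a BERT-style masked-language-model step.

    Assumed defaults (standard BERT, not confirmed for FinBERT):
    each token is selected with probability mask_prob; a selected token
    becomes [MASK] 80% of the time, a random vocabulary token 10% of the
    time, and is left unchanged 10% of the time.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)            # the model must predict this token
            r = rng.random()
            if r < 0.8:
                inputs.append("[MASK]")
            elif r < 0.9:
                inputs.append(rng.choice(vocab))
            else:
                inputs.append(tok)
        else:
            labels.append(None)           # position ignored by the loss
            inputs.append(tok)
    return inputs, labels

# Toy financial sentence and vocabulary, for illustration only.
sentence = "net revenue increased due to higher interest income".split()
toy_vocab = ["revenue", "loss", "income", "debt", "margin"]
masked, targets = mask_tokens(sentence, toy_vocab, mask_prob=0.3, seed=1)
```

A model trained this way on financial text has to predict words such as "revenue" or "income" from their context, which is how domain-specific usage ends up encoded in its weights.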

Quick Start & Requirements

  • Install/Run: Load the models with Hugging Face's transformers library (pip install transformers torch numpy).
  • Prerequisites: Python, Hugging Face transformers, torch, numpy. No specific hardware requirements beyond standard ML inference.
  • Demo: FinBERT-demo.ipynb and finetune.ipynb are provided in the repository.
  • More Info: FinBERT.AI
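As a rough sketch of what loading a fine-tuned model through transformers might look like: the model ID below (`yiyanghkust/finbert-tone`) is an assumption not stated in this summary and should be checked on the Hugging Face Hub before use. The helper names `softmax`, `top_labels`, and `classify` are hypothetical.

```python
import numpy as np

# Assumed Hugging Face model ID for the sentiment model; verify on the Hub.
MODEL_ID = "yiyanghkust/finbert-tone"

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def top_labels(logits, id2label):
    """Map each row of logits to (label, probability) for its best class."""
    probs = softmax(np.asarray(logits, dtype=float))
    best = probs.argmax(axis=-1)
    return [(id2label[int(i)], float(p[i])) for i, p in zip(best, probs)]

def classify(sentences, model_id=MODEL_ID):
    """Download the model on first use and classify a list of sentences."""
    # Imported lazily so the helpers above work without torch installed.
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained(model_id)
    model = BertForSequenceClassification.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    # Label names come from the model's own config, not from this sketch.
    return top_labels(logits.numpy(), model.config.id2label)
```

Calling `classify(["Profit margins improved this quarter."])` would return a list of (label, probability) pairs. The first call downloads the weights, so it needs network access. The notebooks in the repository remain the authoritative usage reference.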

Highlighted Details

  • Offers four pre-trained FinBERT versions: FinVocab-Uncased (recommended), FinVocab-Cased, BaseVocab-Uncased, and BaseVocab-Cased.
  • Fine-tuned models are available on Hugging Face for sentiment, ESG, and FLS classification.
  • Reports state-of-the-art performance on these financial NLP tasks.
  • Pre-trained on a 4.9B token financial corpus including 10-K/10-Q reports, earnings calls, and analyst reports.

Maintenance & Community

  • Migrated to Hugging Face in July 2021.
  • Contact: imyiyang@ust.hk or GitHub issues.

Licensing & Compatibility

  • The README does not explicitly state a license. The code examples depend on Hugging Face's transformers library, which is licensed under Apache 2.0, but that license covers the library only, not FinBERT's code or model weights.

Limitations & Caveats

Because the README states no license, commercial use or closed-source integration carries legal uncertainty.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 15 stars in the last 90 days
