finBERT by ProsusAI

Financial sentiment analysis via fine-tuned BERT

created 5 years ago
1,766 stars

Top 24.8% on sourcepulse

Project Summary

FinBERT provides a pre-trained BERT model specifically fine-tuned for financial sentiment analysis. It aims to improve sentiment classification accuracy on financial texts by leveraging a large financial corpus and a specialized training approach. This is beneficial for researchers and developers working with financial news, reports, or social media data.

How It Works

FinBERT builds upon the BERT architecture by further training it on a large financial corpus (Reuters TRC2 subset) for language model adaptation, followed by fine-tuning on the Financial PhraseBank dataset for sentiment classification. This domain-specific pre-training and fine-tuning approach is designed to capture the nuances of financial language, leading to more accurate sentiment predictions compared to general-purpose NLP models.
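The classification head produces one logit per sentiment class, which is converted to probabilities with a softmax; FinBERT's prediction output also reports a signed sentiment score, P(positive) − P(negative). A minimal sketch of that post-processing step (the label order shown follows the ProsusAI/finbert model card; verify against the model's config if in doubt):

```python
import math

# One logit per sentiment class; assumed label order per the model card.
LABELS = ["positive", "negative", "neutral"]

def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_sentence(logits):
    """Return the predicted label and a signed sentiment score.

    The score, as in FinBERT's prediction output, is
    P(positive) - P(negative), ranging from -1 to 1.
    """
    probs = softmax(logits)
    label = LABELS[probs.index(max(probs))]
    sentiment_score = probs[0] - probs[1]
    return label, sentiment_score

# Example with made-up logits for an upbeat earnings headline:
label, score = score_sentence([2.1, -1.3, 0.4])
```

Here `score_sentence` and the example logits are illustrative, not part of the repository's API; the repo's `predict.py` performs the equivalent computation internally.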

Quick Start & Requirements

  • Install dependencies via Conda: conda env create -f environment.yml and conda activate finbert.
  • Models are available on Hugging Face or can be downloaded and placed in a local directory.
  • Requires Python and Conda. Specific hardware requirements are not detailed but expect typical NLP model resource needs.
  • Official Hugging Face model hub link: https://huggingface.co/ProsusAI/finbert
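Since the model is published on the Hugging Face hub, inference can also be run through the modern transformers library rather than the repo's own scripts. A minimal sketch, assuming the ProsusAI/finbert checkpoint and that `transformers` (with a PyTorch backend) is installed:

```python
# Inference via Hugging Face transformers; note the repo itself still
# depends on the older pytorch_pretrained_bert package.
from transformers import pipeline

classifier = pipeline("text-classification", model="ProsusAI/finbert")
result = classifier("Quarterly revenue grew 30% year over year.")[0]
print(result["label"], round(result["score"], 3))
```

The example sentence is illustrative; `result` is a dict with a `label` (positive, negative, or neutral) and a confidence `score`.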

Highlighted Details

  • Fine-tuned on Financial PhraseBank for sentiment analysis.
  • Further pre-trained on a subset of the Reuters TRC2 dataset.
  • Offers a predict.py script for easy sentiment prediction on text files.
  • Training notebook (finbert_training.ipynb) is provided for custom training.

Maintenance & Community

  • This is an outcome of an intern research project; not an official Prosus product.
  • Contact: Dogu Araci (dogu.araci[at]prosus[dot]com) and Zulkuf Genc (zulkuf.genc[at]prosus[dot]com).

Licensing & Compatibility

  • The README does not explicitly state a license. The project uses pytorch_pretrained_bert, an older version of Hugging Face's transformers library. Compatibility with commercial or closed-source projects is not specified.

Limitations & Caveats

The project relies on an older library (pytorch_pretrained_bert) which is noted as a priority for migration to the newer transformers library. The TRC2 dataset used for language model training is not publicly available, requiring a separate application for access.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 123 stars in the last 90 days
