finBERT by ProsusAI

Financial sentiment analysis via fine-tuned BERT

created 5 years ago
1,766 stars

Top 24.8% on sourcepulse

Project Summary

FinBERT provides a pre-trained BERT model specifically fine-tuned for financial sentiment analysis. It aims to improve sentiment classification accuracy on financial texts by leveraging a large financial corpus and a specialized training approach. This is beneficial for researchers and developers working with financial news, reports, or social media data.

How It Works

FinBERT builds upon the BERT architecture by further training it on a large financial corpus (Reuters TRC2 subset) for language model adaptation, followed by fine-tuning on the Financial PhraseBank dataset for sentiment classification. This domain-specific pre-training and fine-tuning approach is designed to capture the nuances of financial language, leading to more accurate sentiment predictions compared to general-purpose NLP models.
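The classification head produces one logit per sentiment class, which is converted to probabilities with a softmax; FinBERT's prediction output also reports a signed sentiment score, P(positive) − P(negative). A minimal sketch of that post-processing step (the label order shown follows the ProsusAI/finbert model card; verify against the model's config if in doubt):

```python
import math

# One logit per sentiment class; assumed label order per the model card.
LABELS = ["positive", "negative", "neutral"]

def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_sentence(logits):
    """Return the predicted label and a signed sentiment score.

    The score, as in FinBERT's prediction output, is
    P(positive) - P(negative), ranging from -1 to 1.
    """
    probs = softmax(logits)
    label = LABELS[probs.index(max(probs))]
    sentiment_score = probs[0] - probs[1]
    return label, sentiment_score

# Example with made-up logits for an upbeat earnings headline:
label, score = score_sentence([2.1, -1.3, 0.4])
```

Here `score_sentence` and the example logits are illustrative, not part of the repository's API; the repo's `predict.py` performs the equivalent computation internally.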

Quick Start & Requirements

  • Install dependencies via Conda: conda env create -f environment.yml and conda activate finbert.
  • Models are available on Hugging Face or can be downloaded and placed in a local directory.
  • Requires Python and Conda. Specific hardware requirements are not detailed but expect typical NLP model resource needs.
  • Official Hugging Face model hub link: https://huggingface.co/ProsusAI/finbert
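Since the model is published on the Hugging Face hub, inference can also be run through the modern transformers library rather than the repo's own scripts. A minimal sketch, assuming the ProsusAI/finbert checkpoint and that `transformers` (with a PyTorch backend) is installed:

```python
# Inference via Hugging Face transformers; note the repo itself still
# depends on the older pytorch_pretrained_bert package.
from transformers import pipeline

classifier = pipeline("text-classification", model="ProsusAI/finbert")
result = classifier("Quarterly revenue grew 30% year over year.")[0]
print(result["label"], round(result["score"], 3))
```

The example sentence is illustrative; `result` is a dict with a `label` (positive, negative, or neutral) and a confidence `score`.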

Highlighted Details

  • Fine-tuned on Financial PhraseBank for sentiment analysis.
  • Further pre-trained on a subset of the Reuters TRC2 dataset.
  • Offers a predict.py script for easy sentiment prediction on text files.
  • Training notebook (finbert_training.ipynb) is provided for custom training.

Maintenance & Community

  • This is an outcome of an intern research project; not an official Prosus product.
  • Contact: Dogu Araci (dogu.araci[at]prosus[dot]com) and Zulkuf Genc (zulkuf.genc[at]prosus[dot]com).

Licensing & Compatibility

  • The README does not explicitly state a license. The project uses pytorch_pretrained_bert, an older version of Hugging Face's transformers library. Compatibility with commercial or closed-source projects is not specified.

Limitations & Caveats

The project relies on an older library (pytorch_pretrained_bert) which is noted as a priority for migration to the newer transformers library. The TRC2 dataset used for language model training is not publicly available, requiring a separate application for access.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 123 stars in the last 90 days
