FARM by deepset-ai

NLP framework for transfer learning with BERT & Co

created 6 years ago
1,753 stars

Top 25.0% on sourcepulse

Project Summary

FARM (Framework for Adapting Representation Models) is a Python library that simplifies transfer learning with transformer-based language models for Natural Language Processing (NLP) tasks such as question answering, text classification, and named entity recognition. It targets developers and researchers who want efficient model fine-tuning, parallelized preprocessing, and production-ready deployment.

How It Works

FARM uses a modular architecture that separates language models from prediction heads. This makes it easy to swap the underlying model or to combine several heads on one model for multitask learning. It builds on Hugging Face's Transformers library and adds Automatic Mixed Precision (AMP) for faster training and parallelized data preprocessing. The framework also integrates experiment tracking via MLflow and provides tools for caching, checkpointing, and deployment.
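The separation of language model and prediction heads can be illustrated with a minimal, dependency-free sketch. This is not FARM's API: `ModularModel`, the two toy encoders, and the two toy heads are hypothetical names invented for this illustration, and the "encoders" are trivial feature extractors standing in for a real transformer.

```python
from typing import Callable, Dict, List, Sequence

# Toy stand-in for a language model: maps text to a feature vector.
Encoder = Callable[[str], List[float]]
# Toy stand-in for a prediction head: maps a feature vector to a task output.
Head = Callable[[Sequence[float]], object]


def length_encoder(text: str) -> List[float]:
    """Hypothetical encoder A: [character count, word count]."""
    return [float(len(text)), float(len(text.split()))]


def vowel_encoder(text: str) -> List[float]:
    """Hypothetical encoder B: [vowel count, word count]."""
    vowels = sum(ch in "aeiouAEIOU" for ch in text)
    return [float(vowels), float(len(text.split()))]


def is_long_head(features: Sequence[float]) -> bool:
    """Hypothetical classification head: more than three words?"""
    return features[1] > 3


def word_count_head(features: Sequence[float]) -> int:
    """Hypothetical regression-style head: report the word count."""
    return int(features[1])


class ModularModel:
    """One shared encoder feeding any number of named heads."""

    def __init__(self, encoder: Encoder, heads: Dict[str, Head]) -> None:
        self.encoder = encoder
        self.heads = heads

    def predict(self, text: str) -> Dict[str, object]:
        features = self.encoder(text)  # computed once, shared by all heads
        return {name: head(features) for name, head in self.heads.items()}


heads = {"is_long": is_long_head, "word_count": word_count_head}
model = ModularModel(length_encoder, heads)
result = model.predict("FARM separates models from heads")

# Swapping the encoder leaves the heads untouched.
swapped = ModularModel(vowel_encoder, heads)
swapped_result = swapped.predict("FARM separates models from heads")
```

Because each head only sees the encoder's output, replacing the encoder or adding a head is a change to the constructor arguments rather than to the model class, which is the property the modular design is meant to provide.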

Quick Start & Requirements

  • Install via pip: pip install farm
  • Recommended: git clone https://github.com/deepset-ai/FARM.git && cd FARM && pip install -r requirements.txt && pip install --editable .
  • Requires Python and PyTorch. GPU with CUDA is recommended for performance.
  • Official documentation: https://farm.deepset.ai/

Highlighted Details

  • Supports fine-tuning of BERT, RoBERTa, XLNet, ALBERT, DistilBERT, XLM-RoBERTa, ELECTRA, and MiniLM.
  • Offers parallelized preprocessing, plus AMP training that is up to 35% faster.
  • Integrates MLflow for experiment tracking and provides a public MLflow server for testing.
  • Includes features for early stopping, handling imbalanced classes, cross-validation, and caching.
  • Supports training on AWS SageMaker, including cost-saving Managed Spot Instances.
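As an illustration of what handling imbalanced classes can involve, the sketch below computes inverse-frequency ("balanced") class weights, which are typically passed to a weighted loss so that rare classes count more. This is one common scheme and is an assumption here, not a statement of how FARM computes its weights; `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def balanced_class_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Return weight = n_samples / (n_classes * class_count) for each class."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Example: 8 negative and 2 positive samples.
labels = ["neg"] * 8 + ["pos"] * 2
weights = balanced_class_weights(labels)
# The rare class "pos" receives the larger weight.
```

With these weights, each class contributes the same total weight to the loss (8 × 0.625 = 2 × 2.5 = 5), regardless of how many samples it has.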

Maintenance & Community

The core modeling parts of FARM have been migrated to the deepset-ai/haystack repository, and this FARM repo is no longer actively maintained. Development and support have moved to Haystack.

Licensing & Compatibility

  • Licensed under the MIT License.
  • The MIT License permits commercial use and use in closed-source software.

Limitations & Caveats

Because this repository is no longer actively maintained, it should not be expected to receive new features or bug fixes. Users who need either should refer to the deepset-ai/haystack repository, where development continues.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days
