FARM by deepset-ai

NLP framework for transfer learning with BERT & Co

Created 6 years ago

1,752 stars

Top 24.2% on SourcePulse

9 Experts Love This Project

Lewis Tunstall

Research Engineer at Hugging Face

Jan Oberhauser

Founder of n8n

Gabriel Almeida

Cofounder of Langflow

Lysandre Debut

Chief Open-Source Officer at Hugging Face

and 5 more!

Project Summary

FARM (Framework for Adapting Representation Models) is a Python library that simplifies transfer learning with transformer-based language models for Natural Language Processing (NLP) tasks such as question answering, text classification, and named entity recognition. It targets developers and researchers who need efficient model fine-tuning, parallelized preprocessing, and production-ready deployment.

How It Works

FARM uses a modular architecture that separates language models from prediction heads. This makes it easy to swap the underlying model or to combine several heads on one model for multitask learning. It builds on Hugging Face's Transformers library and speeds up training with Automatic Mixed Precision (AMP) and parallelized data preprocessing. The framework also integrates experiment tracking via MLflow and provides tools for caching, checkpointing, and deployment.
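The separation of language model and prediction heads can be illustrated with a small, dependency-free sketch. This is not FARM's API: `ToyLanguageModel`, `ToyHead`, and `ToyAdaptiveModel` are hypothetical stand-ins, and the "encoder" only computes two trivial text statistics. The point is the shape of the design: the text is encoded once, and every head consumes the same features.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical stand-ins for illustration, not FARM classes.
Vector = List[float]


@dataclass
class ToyLanguageModel:
    """Stand-in encoder: maps text to a small feature vector."""
    name: str

    def encode(self, text: str) -> Vector:
        tokens = text.split()
        avg_len = sum(len(t) for t in tokens) / max(len(tokens), 1)
        # Features: [token count, average token length]
        return [float(len(tokens)), avg_len]


@dataclass
class ToyHead:
    """Stand-in prediction head: maps encoder features to a task output."""
    task: str
    predict: Callable[[Vector], str]


@dataclass
class ToyAdaptiveModel:
    """Combines one encoder with any number of heads."""
    language_model: ToyLanguageModel
    heads: List[ToyHead] = field(default_factory=list)

    def forward(self, text: str) -> Dict[str, str]:
        # Encode once; every head reuses the same features (multitask setup).
        features = self.language_model.encode(text)
        return {head.task: head.predict(features) for head in self.heads}


length_head = ToyHead("length", lambda f: "long" if f[0] > 5 else "short")
vocab_head = ToyHead("vocabulary", lambda f: "complex" if f[1] > 6 else "simple")

model = ToyAdaptiveModel(ToyLanguageModel("encoder-a"), [length_head, vocab_head])
outputs = model.forward("FARM separates language models from prediction heads")
print(outputs)

# Swapping the encoder leaves the heads untouched.
model.language_model = ToyLanguageModel("encoder-b")
```

Because the heads depend only on the feature vector, replacing the encoder or adding a head is a one-line change; that is the property the modular design is meant to provide.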

Quick Start & Requirements

Install via pip: pip install farm
Recommended (install from source): git clone https://github.com/deepset-ai/FARM.git && cd FARM && pip install -r requirements.txt && pip install --editable .
Requires Python and PyTorch. GPU with CUDA is recommended for performance.
Official documentation: https://farm.deepset.ai/
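Before installing, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch of such a check; `check_environment` is a hypothetical helper written for this page, not part of FARM. It assumes only that PyTorch, if installed, is importable as `torch`.

```python
import importlib.util
import sys


def check_environment() -> dict:
    """Report the Python version and whether PyTorch and CUDA are usable."""
    report = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Import lazily so the script still runs when PyTorch is missing.
        import torch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


report = check_environment()
print(report)
```

If `cuda_available` is false, training should still work on CPU, only more slowly.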

Highlighted Details

Supports fine-tuning of BERT, RoBERTa, XLNet, ALBERT, DistilBERT, XLM-RoBERTa, ELECTRA, and MiniLM.
Offers parallelized preprocessing and AMP for up to 35% faster training.
Integrates MLflow for experiment tracking and provides a public MLflow server for testing.
Includes features for early stopping, handling imbalanced classes, cross-validation, and caching.
Supports training on AWS SageMaker, including cost-saving Managed Spot Instances.
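To make "handling imbalanced classes" concrete, here is a sketch of one widely used heuristic: weighting each class inversely to its frequency so that rare classes contribute more to the loss. This is an illustration only; whether FARM computes its weights exactly this way is an assumption, and `balanced_class_weights` is a hypothetical helper, not a FARM function.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Toy dataset: 8 negative and 2 positive examples.
labels = ["neg"] * 8 + ["pos"] * 2
weights = balanced_class_weights(labels)
print(weights)
```

With these weights the minority class receives a proportionally larger weight, and a perfectly balanced dataset yields a weight of 1.0 for every class.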

Maintenance & Community

The core modeling parts of FARM have been migrated to the deepset-ai/haystack repository, and this FARM repo is no longer actively maintained. Development and support have moved to Haystack.

Licensing & Compatibility

Licensed under the MIT License.
Compatible with commercial use and closed-source linking.

Limitations & Caveats

Because this repository is no longer actively maintained, it will not receive new features or bug fixes. Users who need either should use the deepset-ai/haystack repository, where all development now takes place.

Health Check

Last Commit

2 years ago

Responsiveness

1 day

Star History

2 stars in the last 30 days