fairseq by facebookresearch

Sequence modeling toolkit for translation, language modeling, and text generation research

created 8 years ago
31,680 stars

Top 1.1% on sourcepulse

Project Summary

Fairseq is a Python-based sequence modeling toolkit for researchers and developers, providing implementations of state-of-the-art models for tasks such as machine translation, summarization, and language modeling. It offers a flexible, extensible framework for training custom models and for reproducing results from numerous influential research papers.

How It Works

Fairseq supports a wide range of architectures, including CNNs, LSTMs, and many Transformer variants (e.g., Transformer-XL, Linformer, NormFormer). Built on PyTorch, it provides advanced training features such as multi-GPU/multi-node parallelism, gradient accumulation, mixed-precision training, and parameter/optimizer state sharding, enabling efficient handling of large models and datasets.

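As a concrete illustration, pre-trained models can be loaded through the torch.hub integration and used as ordinary PyTorch modules. The following is a minimal sketch based on the translation example in the fairseq README; it assumes network access and that the WMT'16 English-German checkpoint is still hosted:

```python
import torch

# Load a pre-trained English-German Transformer via torch.hub
# (downloads the checkpoint on first use).
en2de = torch.hub.load(
    'pytorch/fairseq', 'transformer.wmt16.en-de',
    tokenizer='moses', bpe='subword_nmt',
)
en2de.eval()  # disable dropout for inference

# The hub wrapper exposes the underlying PyTorch module(s) directly.
assert isinstance(en2de.models[0], torch.nn.Module)

if torch.cuda.is_available():
    en2de.cuda()  # move to GPU for faster translation

print(en2de.translate('Hello world!'))
```

The training-side features are exposed as fairseq-train flags; for example, --fp16 enables mixed-precision training and --update-freq N accumulates gradients over N batches to simulate training on more GPUs.
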
Quick Start & Requirements

  • Install: pip install --editable ./ (local development) or pip install fairseq (stable release).
  • Prerequisites: PyTorch >= 1.10.0, Python >= 3.8. NVIDIA GPU and NCCL required for training. Apex library recommended for faster training.
  • Setup: Local installation is straightforward. Pre-trained models are available via torch.hub (see the listing sketch after this list).
  • Docs: Full documentation available.
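
As a quick way to explore the torch.hub integration, the available pre-trained model names can be listed before downloading any weights. A minimal sketch (the entries in the comment are examples taken from the fairseq README):

```python
import torch

# Enumerate the pre-trained models exposed through torch.hub
# (fetches fairseq's hubconf from GitHub, so network access is required).
print(torch.hub.list('pytorch/fairseq'))
# [..., 'transformer.wmt16.en-de', 'transformer.wmt19.en-de', ...]
```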

Highlighted Details

  • Extensive library of reference implementations for over 30 influential sequence modeling papers.
  • Supports multiple generation strategies: beam search, diverse beam search, sampling, and lexically constrained decoding (see the generation sketch after this list).
  • Integrates with xFormers for optimized Transformer performance.
  • Offers pre-trained models for translation and language modeling via torch.hub.
  • Adopted the Hydra configuration framework for flexible project setup.
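
The generation strategy is selected through keyword arguments on the hub interface. In the sketch below, the translate() call with a beam size follows the fairseq README; the sample() call mirrors the README's language-model sampling example, so its parameter names are an assumption for translation models:

```python
import torch

# Reload the hub wrapper from the earlier sketch.
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt16.en-de',
                       tokenizer='moses', bpe='subword_nmt')
en2de.eval()

# Beam search (the default strategy) with a beam width of 5.
print(en2de.translate('Hello world!', beam=5))

# Top-k sampling; parameter names mirror the language-model example
# in the fairseq README and may need adjusting for translation.
print(en2de.sample('Hello world!', sampling=True, sampling_topk=10, temperature=0.8))
```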

Maintenance & Community

The project is actively maintained by Facebook AI Research. Updates are frequent, with recent additions including models for large-scale multilingual speech and direct speech-to-speech translation. Community engagement is facilitated via Twitter, a Facebook group, and a Google group.

Licensing & Compatibility

Fairseq is MIT-licensed, including its pre-trained models, allowing for commercial use and integration into closed-source projects.

Limitations & Caveats

While powerful, fairseq's extensive feature set and numerous examples can present a steep learning curve. Some advanced features or specific paper implementations might require careful configuration and understanding of the underlying research.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 6
  • Star History: 382 stars in the last 90 days

Explore Similar Projects

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM

Transformer library for flexible model development. 1k stars, top 0.3% on sourcepulse. Created 3 years ago; updated 7 months ago.

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), Abhishek Thakur (World's First 4x Kaggle GrandMaster), and 5 more.

xlnet by zihangdai

Language model research paper using generalized autoregressive pretraining. 6k stars, top 0.0% on sourcepulse. Created 6 years ago; updated 2 years ago.

Starred by Lilian Weng (Cofounder of Thinking Machines Lab), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; Author of CS 231n), and 42 more.

transformers by huggingface

ML library for pretrained model inference and training. 148k stars, top 0.2% on sourcepulse. Created 6 years ago; updated 10 hours ago.