fairseq by facebookresearch

Sequence modeling toolkit for translation, language modeling, and text generation research

created 8 years ago
31,680 stars

Top 1.1% on sourcepulse

Project Summary

Fairseq is a Python-based sequence modeling toolkit for researchers and developers, providing implementations of state-of-the-art models for tasks such as machine translation, summarization, and language modeling. It offers a flexible, extensible framework for training custom models and for reproducing results from numerous influential research papers.

How It Works

Fairseq supports a wide range of architectures, including CNNs, LSTMs, and many Transformer variants (e.g., Transformer-XL, Linformer, NormFormer). Built on PyTorch, it provides advanced training features such as multi-GPU/multi-node parallelism, gradient accumulation, mixed-precision training, and parameter/optimizer state sharding, enabling efficient handling of large models and datasets.

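As a concrete illustration, pre-trained models can be loaded through the torch.hub integration and used as ordinary PyTorch modules. The following is a minimal sketch based on the translation example in the fairseq README; it assumes network access and that the WMT'16 English-German checkpoint is still hosted:

```python
import torch

# Load a pre-trained English-German Transformer via torch.hub
# (downloads the checkpoint on first use).
en2de = torch.hub.load(
    'pytorch/fairseq', 'transformer.wmt16.en-de',
    tokenizer='moses', bpe='subword_nmt',
)
en2de.eval()  # disable dropout for inference

# The hub wrapper exposes the underlying PyTorch module(s) directly.
assert isinstance(en2de.models[0], torch.nn.Module)

if torch.cuda.is_available():
    en2de.cuda()  # move to GPU for faster translation

print(en2de.translate('Hello world!'))
```

The training-side features are exposed as fairseq-train flags; for example, --fp16 enables mixed-precision training and --update-freq N accumulates gradients over N batches to simulate training on more GPUs.
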
Quick Start & Requirements

  • Install: pip install --editable ./ (local development) or pip install fairseq (stable release).
  • Prerequisites: PyTorch >= 1.10.0, Python >= 3.8. NVIDIA GPU and NCCL required for training. Apex library recommended for faster training.
  • Setup: Local installation is straightforward. Pre-trained models are available via torch.hub (see the listing sketch after this list).
  • Docs: Full documentation available.
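
As a quick way to explore the torch.hub integration, the available pre-trained model names can be listed before downloading any weights. A minimal sketch (the entries in the comment are examples taken from the fairseq README):

```python
import torch

# Enumerate the pre-trained models exposed through torch.hub
# (fetches fairseq's hubconf from GitHub, so network access is required).
print(torch.hub.list('pytorch/fairseq'))
# [..., 'transformer.wmt16.en-de', 'transformer.wmt19.en-de', ...]
```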

Highlighted Details

  • Extensive library of reference implementations for over 30 influential sequence modeling papers.
  • Supports multiple generation strategies: beam search, diverse beam search, sampling, and lexically constrained decoding (see the generation sketch after this list).
  • Integrates with xFormers for optimized Transformer performance.
  • Offers pre-trained models for translation and language modeling via torch.hub.
  • Adopted the Hydra configuration framework for flexible project setup.
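
The generation strategy is selected through keyword arguments on the hub interface. In the sketch below, the translate() call with a beam size follows the fairseq README; the sample() call mirrors the README's language-model sampling example, so its parameter names are an assumption for translation models:

```python
import torch

# Reload the hub wrapper from the earlier sketch.
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt16.en-de',
                       tokenizer='moses', bpe='subword_nmt')
en2de.eval()

# Beam search (the default strategy) with a beam width of 5.
print(en2de.translate('Hello world!', beam=5))

# Top-k sampling; parameter names mirror the language-model example
# in the fairseq README and may need adjusting for translation.
print(en2de.sample('Hello world!', sampling=True, sampling_topk=10, temperature=0.8))
```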

Maintenance & Community

The project is actively maintained by Facebook AI Research. Updates are frequent, with recent additions including models for large-scale multilingual speech and direct speech-to-speech translation. Community engagement is facilitated via Twitter, a Facebook group, and a Google group.

Licensing & Compatibility

Fairseq is MIT-licensed, including its pre-trained models, allowing for commercial use and integration into closed-source projects.

Limitations & Caveats

While powerful, fairseq's extensive feature set and numerous examples can present a steep learning curve. Some advanced features or specific paper implementations might require careful configuration and understanding of the underlying research.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 6
  • Star History: 382 stars in the last 90 days

Explore Similar Projects

Starred by Jeremy Howard (Cofounder of fast.ai) and Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

SwissArmyTransformer by THUDM

Transformer library for flexible model development. 1k stars, top 0.3% on sourcepulse. Created 3 years ago; updated 7 months ago.

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), Abhishek Thakur (World's First 4x Kaggle GrandMaster), and 5 more.

xlnet by zihangdai

Language model research paper using generalized autoregressive pretraining. 6k stars, top 0.0% on sourcepulse. Created 6 years ago; updated 2 years ago.

Starred by Lilian Weng (Cofounder of Thinking Machines Lab), Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; Author of CS 231n), and 42 more.

transformers by huggingface

ML library for pretrained model inference and training. 148k stars, top 0.2% on sourcepulse. Created 6 years ago; updated 10 hours ago.