fairseq by facebookresearch

Sequence modeling toolkit for translation, language modeling, and text generation research

Created 8 years ago
31,812 stars

Top 1.1% on SourcePulse

Project Summary

Fairseq is a Python-based sequence modeling toolkit for researchers and developers, offering implementations of various state-of-the-art models for tasks like machine translation, summarization, and language modeling. It provides a flexible and extensible framework for training custom models and reproducing results from numerous influential research papers.

How It Works

Fairseq supports a wide range of architectures including CNNs, LSTMs, and various Transformer variants (e.g., Transformer-XL, Linformer, NormFormer). It leverages PyTorch for its backend and offers advanced training features like multi-GPU/multi-node parallelism, gradient accumulation, mixed-precision training, and parameter/optimizer state sharding for efficient handling of large models and datasets.
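To make the gradient-accumulation feature mentioned above concrete, here is a minimal pure-Python sketch of the idea (an illustration only, not fairseq's implementation): gradients from several micro-batches are summed before a single parameter update, simulating a larger effective batch size than fits in memory.

```python
def grad(w, x, y):
    """Gradient of the squared error 0.5 * (w*x - y)**2 with respect to w."""
    return (w * x - y) * x

def train_step(w, micro_batches, lr=0.1, accumulate_steps=4):
    """Accumulate gradients over several micro-batches, then update once."""
    g = 0.0
    for x, y in micro_batches[:accumulate_steps]:
        g += grad(w, x, y)
    # Average the accumulated gradient, then apply a single update,
    # as if one large batch had been processed.
    return w - lr * g / accumulate_steps

# Toy data following y = 2x; the weight should converge toward 2.0.
data = [(1.0, 2.0), (2.0, 4.0), (1.0, 2.0), (2.0, 4.0)]
w = 0.0
for _ in range(200):
    w = train_step(w, data)
print(round(w, 3))
```

In a real trainer the same pattern appears as calling `backward()` on several micro-batches before a single optimizer step; fairseq exposes this via its `--update-freq` option.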

Quick Start & Requirements

  • Install: pip install fairseq (stable release) or pip install --editable ./ from a clone (local development).
  • Prerequisites: Python >= 3.8 and PyTorch >= 1.10.0. An NVIDIA GPU and NCCL are required for training; NVIDIA's Apex library is recommended for faster training.
  • Setup: Local installation is straightforward; pre-trained models are available via torch.hub.
  • Docs: Full documentation is available.

Highlighted Details

  • Extensive library of reference implementations for over 30 influential sequence modeling papers.
  • Supports multiple generation strategies: beam search, diverse beam search, sampling, and lexically constrained decoding.
  • Integrates with xFormers for optimized Transformer performance.
  • Offers pre-trained models for translation and language modeling via torch.hub.
  • Adopted the Hydra configuration framework for flexible project setup.
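To illustrate the first of the generation strategies listed above, here is a self-contained beam-search sketch over a hypothetical toy model (fairseq's actual generator is far more featureful, handling end-of-sequence tokens, length normalization, and batching):

```python
import math

def beam_search(step_logprobs, beam_size=2, max_len=3):
    """step_logprobs(prefix) -> {token: log-probability}.
    Keeps the beam_size highest-scoring prefixes at each step."""
    beams = [([], 0.0)]  # (token list, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in step_logprobs(prefix).items():
                candidates.append((prefix + [tok], score + lp))
        # Prune to the top beam_size hypotheses.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

def toy_model(prefix):
    # Hypothetical fixed distribution: always prefers 'a', then 'b'.
    return {"a": math.log(0.6), "b": math.log(0.3), "c": math.log(0.1)}

best, score = beam_search(toy_model)[0]
print("".join(best))  # highest-scoring hypothesis
```

Sampling replaces the top-k pruning with random draws from the step distribution, and diverse beam search adds a penalty that pushes beams away from each other.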

Maintenance & Community

The project is actively maintained by Facebook AI Research. Updates are frequent, with recent additions including models for large-scale multilingual speech and direct speech-to-speech translation. Community engagement is facilitated via Twitter, a Facebook group, and a Google group.

Licensing & Compatibility

Fairseq is MIT-licensed, including its pre-trained models, allowing for commercial use and integration into closed-source projects.

Limitations & Caveats

While powerful, fairseq's extensive feature set and numerous examples can present a steep learning curve. Some advanced features or specific paper implementations might require careful configuration and understanding of the underlying research.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 2
  • Star History: 127 stars in the last 30 days
