fairseq2 by facebookresearch

Sequence modeling toolkit for content generation research

Created 2 years ago
1,040 stars

Top 36.1% on SourcePulse

Project Summary

fairseq2 is a modular, extensible toolkit for training custom sequence models, primarily for content generation tasks. It targets researchers and engineers working on advanced AI projects, offering a clean API and support for large-scale multi-GPU and multi-node training.

How It Works

fairseq2 is a complete rewrite of the original fairseq with a less intrusive, more extensible architecture. It uses modern PyTorch features such as torch.compile and FSDP, and includes a C++-based streaming data pipeline for high throughput. New components are registered through a setuptools extension mechanism, so they can be added without forking the library.

Quick Start & Requirements

  • Install: pip install fairseq2 (install PyTorch first, in the version that matches the chosen fairseq2 variant).
  • Dependencies: libsndfile (install via system package manager or Homebrew).
  • Variants: Pre-built packages require matching PyTorch and CUDA versions. See the Variants table for combinations.
  • Documentation: Stable, Nightly
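Because the pre-built packages must match the installed PyTorch and CUDA versions, it can help to check what is already present before choosing a variant. The helper below is a small sketch for that purpose; `describe_torch` is a name invented here, while `torch.__version__` and `torch.version.cuda` are standard PyTorch attributes (the latter is `None` on CPU-only builds).

```python
import importlib.util
from typing import Optional


def describe_torch() -> Optional[dict]:
    """Return the installed PyTorch and CUDA versions, or None if PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the check also works without PyTorch

    return {"torch": str(torch.__version__), "cuda": torch.version.cuda}


if __name__ == "__main__":
    info = describe_torch()
    if info is None:
        print("PyTorch is not installed; install it before fairseq2.")
    else:
        print(f"PyTorch {info['torch']}, CUDA {info['cuda']}")
```

The reported pair can then be compared against the supported combinations before running the install command.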

Highlighted Details

  • Supports instruction finetuning and preference optimization (DPO, CPO, SimPO, ORPO).
  • Supports multi-GPU and multi-node training with DDP, FSDP, and tensor parallelism, scaling to models with more than 70B parameters.
  • Native vLLM integration with built-in sampling and beam search.
  • Supports LLaMA 1-3.3, Mistral 7B, NLLB-200, and various speech/vision models.

Maintenance & Community

Developed by Meta AI (FAIR). Contribution guidelines are available.

Licensing & Compatibility

MIT licensed. Compatible with commercial use.

Limitations & Caveats

There is no native Windows support; WSL2 is recommended. Pre-built packages require strict PyTorch/CUDA version matching because of C++ API compatibility constraints. On ARM64 macOS, non-PyPI variants must be built from source.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull requests (30d): 48
  • Issues (30d): 33
  • Star history: 11 stars in the last 30 days
