fairseq2  by facebookresearch

Sequence modeling toolkit for content generation research

Created 2 years ago
1,044 stars

Top 36.0% on SourcePulse

Project Summary

fairseq2 is a modular, extensible toolkit for training custom sequence models, primarily for content generation tasks. It targets researchers and engineers working on advanced AI projects, offering a clean API and support for large-scale, multi-GPU/multi-node training.

How It Works

fairseq2 is a complete rewrite of the original fairseq with a less intrusive, more extensible architecture. It uses modern PyTorch features such as torch.compile and FSDP, and includes a C++-based streaming data pipeline for high throughput. Extensibility is handled through a setuptools extension mechanism, so new components can be registered without forking the library.

Quick Start & Requirements

  • Install: pip install fairseq2 (ensure PyTorch is installed first, matching the fairseq2 variant).
  • Dependencies: libsndfile (install via system package manager or Homebrew).
  • Variants: Pre-built packages require matching PyTorch and CUDA versions. See the Variants table for combinations.
  • Documentation: available for both the Stable and Nightly releases.
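Because the pre-built packages must match the installed PyTorch and CUDA versions, it can help to check what is installed before choosing a variant. The following is a small sketch that relies only on standard PyTorch attributes (`torch.__version__` and `torch.version.cuda`); the helper name `describe_torch_build` is an illustrative choice, not part of fairseq2.

```python
def describe_torch_build():
    """Return (torch_version, cuda_version), or None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    # torch.version.cuda is None for CPU-only builds.
    return torch.__version__, torch.version.cuda


build = describe_torch_build()
if build is None:
    print("PyTorch is not installed; install it before fairseq2.")
else:
    torch_version, cuda_version = build
    print(f"PyTorch {torch_version}, CUDA {cuda_version or 'none (CPU build)'}")
```

The reported pair can then be compared against the Variants table to pick the matching fairseq2 package.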

Highlighted Details

  • Supports instruction finetuning and preference optimization (DPO, CPO, SimPO, ORPO).
  • Enables multi-GPU/multi-node training with DDP, FSDP, and tensor parallelism, scaling to models with more than 70B parameters.
  • Native vLLM integration with built-in sampling and beam search.
  • Supports LLaMA 1-3.3, Mistral 7B, NLLB-200, and various speech/vision models.
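As an illustration of what preference optimization computes, the sketch below evaluates the standard DPO loss for a single preference pair in plain Python. It is not fairseq2's implementation: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example. Inputs are summed log-probabilities of the chosen and rejected responses under the trained policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one pair: -log(sigmoid(beta * margin))."""
    # How much more the policy prefers the chosen response than the reference does.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# When the policy equals the reference, the margin is zero and the loss is log(2).
print(dpo_loss(-5.0, -7.0, -5.0, -7.0))
# The loss drops once the policy favors the chosen response more than the reference does.
print(dpo_loss(-4.0, -8.0, -5.0, -7.0))
```

CPO, SimPO, and ORPO modify this objective in different ways; those variants are not shown here.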

Maintenance & Community

Developed by Meta AI (FAIR). Contribution guidelines are available.

Licensing & Compatibility

MIT licensed. Compatible with commercial use.

Limitations & Caveats

No native Windows support; WSL2 is recommended. Pre-built packages require strict PyTorch/CUDA version matching due to C++ API compatibility issues. ARM64 macOS requires building from source for non-PyPI variants.

Health Check

  • Last commit: 16 hours ago
  • Responsiveness: 1 week
  • Pull requests (30d): 39
  • Issues (30d): 25
  • Star history: 6 stars in the last 30 days
