fairseq2 by facebookresearch

Sequence modeling toolkit for content generation research

created 2 years ago
1,020 stars

Top 37.3% on sourcepulse

Project Summary

fairseq2 is a modular and extensible toolkit for training custom sequence models, primarily for content generation tasks. It is designed for researchers and engineers working on advanced AI projects, offering a clean API and supporting large-scale, multi-GPU/multi-node training.

How It Works

fairseq2 is a complete rewrite of the original fairseq, adopting a less intrusive, extensible architecture. It leverages modern PyTorch features like torch.compile and FSDP, and includes a C++ based streaming data pipeline for high throughput. Its extensibility is managed via a setuptools extension mechanism, allowing easy registration of new components without forking.
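As a rough illustration of that mechanism, the sketch below registers a hypothetical plugin package through a setuptools entry point. The entry-point group name "fairseq2" and the setup_extension signature are assumptions made for illustration; the fairseq2 documentation defines the exact contract.

    # setup.py of a hypothetical plugin package. The entry-point group name
    # "fairseq2" and the setup_extension signature are assumptions, not the
    # verbatim fairseq2 contract.
    from setuptools import setup

    setup(
        name="my-fairseq2-plugin",
        version="0.1.0",
        packages=["my_plugin"],
        entry_points={
            # Extension functions registered under this group can be discovered
            # at startup, so new models, datasets, or recipes are added without
            # forking the toolkit.
            "fairseq2": ["my_plugin = my_plugin.extension:setup_extension"],
        },
    )

    # my_plugin/extension.py
    def setup_extension(context) -> None:
        """Register custom components with the framework-provided context."""
        ...

Once such a plugin is installed in the same environment, the idea is that fairseq2 picks it up automatically, with no changes to the fairseq2 source tree.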

Quick Start & Requirements

  • Install: pip install fairseq2 (ensure PyTorch is installed first, matching the fairseq2 variant); a quick post-install check is sketched after this list.
  • Dependencies: libsndfile (install via system package manager or Homebrew).
  • Variants: Pre-built packages require matching PyTorch and CUDA versions. See the Variants table for combinations.
  • Documentation: Stable, Nightly
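
A minimal sanity check after installation, using only the standard library and PyTorch APIs (the printed versions depend on the variant you installed):

    # Confirm that PyTorch and fairseq2 import cleanly, report their versions,
    # and check whether a CUDA device is visible.
    from importlib.metadata import version

    import torch
    import fairseq2  # noqa: F401 -- fails here if the install is broken

    print("torch:", torch.__version__)
    print("fairseq2:", version("fairseq2"))
    print("CUDA available:", torch.cuda.is_available())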

Highlighted Details

  • Supports instruction finetuning and preference optimization (DPO, CPO, SimPO, ORPO).
  • Enables multi-GPU/multi-node training with DDP, FSDP, and tensor parallelism for models >70B parameters (a generic FSDP sketch follows this list).
  • Native vLLM integration with built-in sampling and beam search.
  • Supports LLaMA 1-3.3, Mistral 7B, NLLB-200, and various speech/vision models.
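
For orientation, the sketch below shows the kind of FSDP plumbing that a trainer like fairseq2's manages for you. It is plain PyTorch (launched with torchrun), not fairseq2's own API, and the toy model is purely illustrative.

    # Generic PyTorch FSDP setup; fairseq2 hides this behind its recipes.
    import os

    import torch
    import torch.distributed as dist
    from torch import nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


    def main() -> None:
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        # Toy model standing in for a real sequence model.
        model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
        model = FSDP(model.cuda())  # shards parameters, grads, and optimizer state

        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
        x = torch.randn(8, 1024, device="cuda")
        loss = model(x).square().mean()
        loss.backward()
        optimizer.step()

        dist.destroy_process_group()


    if __name__ == "__main__":
        main()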

Maintenance & Community

Developed by Meta AI (FAIR). Contribution guidelines are available.

Licensing & Compatibility

MIT licensed. Compatible with commercial use.

Limitations & Caveats

No native Windows support; WSL2 is recommended. Pre-built packages require strict PyTorch/CUDA version matching due to C++ API compatibility issues. ARM64 macOS requires building from source for non-PyPI variants.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 18
  • Issues (30d): 3
  • Star History: 126 stars in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Michael Han (Cofounder of Unsloth), and 1 more.

ktransformers by kvcache-ai

Top 0.4% on sourcepulse
15k stars
Framework for LLM inference optimization experimentation
created 1 year ago
updated 3 days ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 12 more.

DeepSpeed by deepspeedai

Top 0.2% on sourcepulse
40k stars
Deep learning optimization library for distributed training and inference
created 5 years ago
updated 1 day ago