metaseq by facebookresearch

Codebase for large-scale transformer model development and deployment

Created 3 years ago
6,546 stars

Top 7.8% on SourcePulse

Project Summary

Metaseq is a codebase for working with Open Pre-trained Transformers (OPT) at scale, originally forked from fairseq. It serves researchers and engineers working with massive language models, providing a flexible platform with numerous integrations for efficient training, fine-tuning, and inference across various hardware and software ecosystems.

How It Works

Metaseq extends the fairseq framework to facilitate large-scale OPT model development. Its core advantage lies in its extensive integrations with popular AI/ML libraries such as Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed. This allows users to leverage diverse optimization techniques, including 8-bit quantization via SmoothQuant (with CTranslate2) and optimized inference engines (FasterTransformer), catering to a wide range of deployment scenarios and hardware capabilities.
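For example, OPT checkpoints published on the Hugging Face Hub can be loaded through the standard Transformers API. The snippet below is a minimal illustrative sketch, not part of metaseq itself; it assumes the transformers package and the public facebook/opt-125m checkpoint, and larger OPT variants follow the same pattern.

    # Illustrative only: load a public OPT checkpoint via Hugging Face Transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
    model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

    inputs = tokenizer("Hello, my name is", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))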

Quick Start & Requirements

Setup instructions are linked from the repository rather than reproduced in this summary. The listed integrations imply requirements for specific hardware (e.g., GPUs) and software environments (e.g., CUDA for FasterTransformer).

Highlighted Details

  • Broad ecosystem support: Integrates with Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed for flexible model deployment and optimization.
  • Scalability: Supports OPT models ranging from 125 million to 175 billion parameters.
  • Inference Optimization: Integrates with CTranslate2 for efficient inference, including 8-bit quantization using SmoothQuant, and with FasterTransformer for high-performance serving (see the sketch after this list).
  • Training Flexibility: Enables fine-tuning using frameworks like DeepSpeed.
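
As a concrete illustration of the CTranslate2 path, the sketch below converts a public OPT checkpoint to the CTranslate2 format with plain 8-bit weight quantization and runs generation on it. This is not metaseq code and does not apply SmoothQuant activation scales; it assumes the ctranslate2 and transformers packages and the public facebook/opt-125m checkpoint.

    # Hypothetical sketch: CTranslate2 conversion and int8 inference for an OPT checkpoint.
    import ctranslate2
    import transformers

    # Convert the Hugging Face checkpoint to CTranslate2 format with int8 weights.
    converter = ctranslate2.converters.TransformersConverter("facebook/opt-125m")
    converter.convert("opt-125m-ct2", quantization="int8", force=True)

    # Load the converted model and generate from a tokenized prompt.
    generator = ctranslate2.Generator("opt-125m-ct2")
    tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/opt-125m")
    start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, my name is"))
    results = generator.generate_batch([start_tokens], max_length=32)
    print(tokenizer.decode(results[0].sequences_ids[0]))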

Maintenance & Community

Maintainers are listed under CODEOWNERS, though the repository shows little recent activity (see Health Check below). Community support and bug reporting are handled through the GitHub Issues page, with contribution guidelines available in a separate document.

Licensing & Compatibility

The majority of metaseq is licensed under the permissive MIT license. However, specific components, notably Megatron-LM, are subject to separate license terms (Megatron-LM license). Users must carefully review these dual licensing conditions, particularly for commercial applications or derivative works.

Limitations & Caveats

The primary caveat is the mixed licensing, requiring careful attention to the terms of the Megatron-LM license for certain project portions. Specific setup details or hardware/software prerequisites beyond general framework integrations are not detailed in this excerpt.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 4 stars in the last 30 days

Explore Similar Projects

Starred by Shengjia Zhao (Chief Scientist at Meta Superintelligence Lab), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 14 more.

BIG-bench by google

Top 0.1% on SourcePulse · 3k stars
Collaborative benchmark for probing and extrapolating LLM capabilities
Created 4 years ago · Updated 1 year ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 16 more.

text-to-text-transfer-transformer by google-research

Top 0.1% on SourcePulse · 6k stars
Unified text-to-text transformer for NLP research
Created 6 years ago · Updated 5 months ago