metaseq by facebookresearch

Codebase for large-scale transformer model development and deployment

Created 3 years ago
6,546 stars

Top 7.8% on SourcePulse

Project Summary

Metaseq is a codebase for working with Open Pre-trained Transformers (OPT) at scale, originally forked from fairseq. It serves researchers and engineers working with massive language models, providing a flexible platform with numerous integrations for efficient training, fine-tuning, and inference across various hardware and software ecosystems.

How It Works

Metaseq extends the fairseq framework to facilitate large-scale OPT model development. Its core advantage lies in its extensive integrations with popular AI/ML libraries such as Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed. This allows users to leverage diverse optimization techniques, including 8-bit quantization via SmoothQuant (with CTranslate2) and optimized inference engines (FasterTransformer), catering to a wide range of deployment scenarios and hardware capabilities.
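For example, OPT checkpoints published on the Hugging Face Hub can be loaded through the standard Transformers API. The snippet below is a minimal illustrative sketch, not part of metaseq itself; it assumes the transformers package and the public facebook/opt-125m checkpoint, and larger OPT variants follow the same pattern.

    # Illustrative only: load a public OPT checkpoint via Hugging Face Transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
    model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

    inputs = tokenizer("Hello, my name is", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))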

Quick Start & Requirements

Setup instructions are linked from the repository rather than reproduced in this summary. The listed integrations imply requirements for specific hardware (e.g., GPUs) and software environments (e.g., CUDA for FasterTransformer).

Highlighted Details

  • Broad ecosystem support: Integrates with Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed for flexible model deployment and optimization.
  • Scalability: Supports OPT models ranging from 125 million to 175 billion parameters.
  • Inference Optimization: Integrates with CTranslate2 for efficient inference, including 8-bit quantization using SmoothQuant, and with FasterTransformer for high-performance serving (see the sketch after this list).
  • Training Flexibility: Enables fine-tuning using frameworks like DeepSpeed.
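
As a concrete illustration of the CTranslate2 path, the sketch below converts a public OPT checkpoint to the CTranslate2 format with plain 8-bit weight quantization and runs generation on it. This is not metaseq code and does not apply SmoothQuant activation scales; it assumes the ctranslate2 and transformers packages and the public facebook/opt-125m checkpoint.

    # Hypothetical sketch: CTranslate2 conversion and int8 inference for an OPT checkpoint.
    import ctranslate2
    import transformers

    # Convert the Hugging Face checkpoint to CTranslate2 format with int8 weights.
    converter = ctranslate2.converters.TransformersConverter("facebook/opt-125m")
    converter.convert("opt-125m-ct2", quantization="int8", force=True)

    # Load the converted model and generate from a tokenized prompt.
    generator = ctranslate2.Generator("opt-125m-ct2")
    tokenizer = transformers.AutoTokenizer.from_pretrained("facebook/opt-125m")
    start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, my name is"))
    results = generator.generate_batch([start_tokens], max_length=32)
    print(tokenizer.decode(results[0].sequences_ids[0]))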

Maintenance & Community

Maintainers are listed under CODEOWNERS, though the repository shows little recent activity (see Health Check below). Community support and bug reporting are handled through the GitHub Issues page, with contribution guidelines available in a separate document.

Licensing & Compatibility

The majority of metaseq is licensed under the permissive MIT license. However, specific components, notably Megatron-LM, are subject to separate license terms (Megatron-LM license). Users must carefully review these dual licensing conditions, particularly for commercial applications or derivative works.

Limitations & Caveats

The primary caveat is the mixed licensing, requiring careful attention to the terms of the Megatron-LM license for certain project portions. Specific setup details or hardware/software prerequisites beyond general framework integrations are not detailed in this excerpt.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 4 stars in the last 30 days

Explore Similar Projects

Starred by Shengjia Zhao (Chief Scientist at Meta Superintelligence Lab), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 14 more.

BIG-bench by google

Top 0.1% on SourcePulse · 3k stars
Collaborative benchmark for probing and extrapolating LLM capabilities
Created 4 years ago · Updated 1 year ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 16 more.

text-to-text-transfer-transformer by google-research

Top 0.1% on SourcePulse · 6k stars
Unified text-to-text transformer for NLP research
Created 6 years ago · Updated 5 months ago