metaseq by facebookresearch

Codebase for large-scale transformer model development and deployment

Created 3 years ago
6,547 stars

Top 7.8% on SourcePulse

Project Summary

Metaseq is a large-scale codebase for Open Pre-trained Transformers (OPT), forked from fairseq. It serves researchers and engineers working with massive language models, providing a flexible platform with numerous integrations for efficient training, fine-tuning, and inference across various hardware and software ecosystems.

How It Works

Metaseq extends the fairseq framework to facilitate large-scale OPT model development. Its core advantage lies in its extensive integrations with popular AI/ML libraries such as Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed. This allows users to leverage diverse optimization techniques, including 8-bit quantization via SmoothQuant (with CTranslate2) and optimized inference engines (FasterTransformer), catering to a wide range of deployment scenarios and hardware capabilities.
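
As a concrete illustration of the Hugging Face Transformers integration, the sketch below loads one of the released OPT checkpoints from the Hub and generates text. The checkpoint name (facebook/opt-125m) and generation settings are illustrative assumptions, not something prescribed by metaseq itself.

```python
# Minimal sketch: load a released OPT checkpoint via Hugging Face Transformers
# and run greedy generation. Checkpoint name and settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # smallest OPT checkpoint; larger ones use the same API
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```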

Quick Start & Requirements

Setup instructions are linked from the repository rather than included in the README itself. The listed integrations imply requirements for specific hardware (e.g., GPUs) and software environments (e.g., CUDA for FasterTransformer).

Highlighted Details

  • Broad ecosystem support: Integrates with Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed for flexible model deployment and optimization.
  • Scalability: Supports OPT models ranging from 125 million to 175 billion parameters.
  • Inference Optimization: Features integration with CTranslate2 for efficient inference, including 8-bit quantization using SmoothQuant, and FasterTransformer for high-performance serving (see the sketch after this list).
  • Training Flexibility: Enables fine-tuning using frameworks like DeepSpeed.
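
As a hedged sketch of the CTranslate2 path, the example below converts an OPT checkpoint to the CTranslate2 format with 8-bit quantization and runs generation with it. The model name, output directory, and prompt are illustrative assumptions; plain int8 is shown, whereas SmoothQuant additionally requires pre-computed activation scales supplied to the converter.

```python
# Hedged sketch: convert an OPT checkpoint to CTranslate2 with int8
# quantization, then generate. Names and paths are illustrative.
import ctranslate2
from ctranslate2.converters import TransformersConverter
from transformers import AutoTokenizer

model_name = "facebook/opt-125m"
output_dir = "opt-125m-ct2-int8"

# One-time conversion to the CTranslate2 format with 8-bit weights.
TransformersConverter(model_name).convert(output_dir, quantization="int8")

# Inference with the converted model.
tokenizer = AutoTokenizer.from_pretrained(model_name)
generator = ctranslate2.Generator(output_dir)

tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Large language models are"))
results = generator.generate_batch([tokens], max_length=30)
print(tokenizer.decode(results[0].sequences_ids[0]))
```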

Maintenance & Community

Maintenance is attributed to the team listed under CODEOWNERS. Community support and bug reporting are handled through the GitHub Issues page, with contribution guidelines available in a separate document.

Licensing & Compatibility

The majority of metaseq is licensed under the permissive MIT license. However, specific components, notably Megatron-LM, are subject to separate license terms (Megatron-LM license). Users must carefully review these dual licensing conditions, particularly for commercial applications or derivative works.

Limitations & Caveats

The primary caveat is the mixed licensing, requiring careful attention to the terms of the Megatron-LM license for certain project portions. Specific setup details or hardware/software prerequisites beyond general framework integrations are not detailed in this excerpt.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Wei-Lin Chiang (Cofounder of LMArena), and 13 more.

awesome-tensor-compilers by merrymercy

0.4%
3k
Curated list of tensor compiler projects and papers
Created 5 years ago
Updated 1 year ago
Starred by Shengjia Zhao (Chief Scientist at Meta Superintelligence Lab), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 14 more.

BIG-bench by google

0.2%
3k
Collaborative benchmark for probing and extrapolating LLM capabilities
Created 4 years ago
Updated 1 year ago
Starred by Lysandre Debut (Chief Open-Source Officer at Hugging Face), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 14 more.

simpletransformers by ThilinaRajapakse

0.0%
4k
Rapid NLP task implementation
Created 6 years ago
Updated 3 months ago
Starred by Aravind Srinivas (Cofounder of Perplexity), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 16 more.

text-to-text-transfer-transformer by google-research

0.1%
6k
Unified text-to-text transformer for NLP research
Created 6 years ago
Updated 3 weeks ago
Starred by Vaibhav Nivargi (Cofounder of Moveworks), Chuan Li (Chief Scientific Officer at Lambda), and 5 more.

awesome-mlops by visenger

0.1%
13k
Curated MLOps knowledge hub
Created 5 years ago
Updated 1 year ago