ModelCenter by OpenBMB

Transformer library for efficient, low-resource, distributed training

Created 3 years ago · 262 stars · Top 97.3% on SourcePulse

Project Summary

ModelCenter provides efficient, low-resource, and extensible implementations of large pre-trained language models (PLMs) for distributed training. It targets researchers and engineers working with large transformer models, offering a more memory-efficient and user-friendly alternative to frameworks such as DeepSpeed and Megatron.

How It Works

ModelCenter builds on the OpenBMB/BMTrain backend, which integrates ZeRO optimization for efficient distributed training. This substantially reduces the per-GPU memory footprint, enabling larger batch sizes and better GPU utilization. The framework keeps a PyTorch-style coding model, aiming for simpler configuration and a more uniform development experience than other distributed training stacks.
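
As a rough illustration of this workflow, the following minimal sketch initializes the BMTrain runtime and loads a model through ModelCenter. The calls follow the project README, but exact APIs may differ across versions.

    # Minimal sketch of the PyTorch-style workflow described above
    # (modeled on the project README; APIs may differ between versions).
    import bmtrain as bmt

    # One-time initialization of the BMTrain runtime; BMTrain's ZeRO
    # optimization partitions parameters and optimizer state across ranks.
    bmt.init_distributed(seed=0)

    from model_center.model import Bert

    # Checkpoints are referenced with HuggingFace-style identifiers.
    model = Bert.from_pretrained("bert-base-uncased")
    bmt.print_rank("loaded:", type(model).__name__)  # prints once, from rank 0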

Quick Start & Requirements

  • Installation: pip install model-center, or install from source.
  • Prerequisites: Python and PyTorch. Distributed training is launched with torch.distributed (e.g., via torchrun); a launch sketch follows this list.
  • Documentation: https://github.com/OpenBMB/ModelCenter
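
For context, here is a hedged sketch of launching a distributed run. The AdamOffloadOptimizer name follows BMTrain's documentation and "gpt2-base" is an assumed checkpoint identifier; verify both against the versions you install.

    # Launch this script with PyTorch's distributed launcher, e.g.:
    #   torchrun --nnodes=1 --nproc_per_node=4 train.py
    import bmtrain as bmt

    # Picks up the RANK/WORLD_SIZE environment variables set by torchrun.
    bmt.init_distributed(seed=0)

    from model_center.model import GPT2

    # Assumed checkpoint name; check ModelCenter's model zoo for valid IDs.
    model = GPT2.from_pretrained("gpt2-base")

    # BMTrain ships ZeRO-aware optimizers; AdamOffloadOptimizer keeps
    # optimizer state in CPU memory to reduce the GPU footprint.
    optimizer = bmt.optim.AdamOffloadOptimizer(model.parameters(), lr=1e-5)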

Highlighted Details

  • Supports a wide range of popular PLMs, including BERT, RoBERTa, T5, GPT-2, GPT-J, Longformer, GLM, ViT, and LLaMA.
  • Efficient memory management reduces the memory footprint several-fold.
  • Optimized for low-resource distributed training via ZeRO optimization.
  • Ships beam-search generation for models such as T5 and LLaMA (a hedged generation sketch follows this list).
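
The beam-search support noted above looks roughly like the sketch below. The model_center.generation import path and the T5BeamSearch class and its arguments are assumptions modeled on the repository's example scripts; confirm the exact interface in the source tree.

    import bmtrain as bmt

    bmt.init_distributed(seed=0)

    from model_center.model import T5
    from model_center.tokenizer import T5Tokenizer
    # Assumed import path and class name; verify against the repository.
    from model_center.generation.t5 import T5BeamSearch

    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5.from_pretrained("t5-base")

    # Assumed constructor and generate() arguments for the helper.
    beam_search = T5BeamSearch(model=model, tokenizer=tokenizer)
    outputs = beam_search.generate(["translate English to German: Hello."], beam_size=4)
    bmt.print_rank(outputs)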

Maintenance & Community

The last commit landed about a year ago, and there has been no pull-request or issue activity in the past 30 days (see the Health Check below).

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: The permissive license suits commercial use and integration with closed-source projects.

Limitations & Caveats

The project is built upon BMTrain, and its performance and feature set are closely tied to that dependency. While it supports many models, specific model implementations or advanced features might still be under active development.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

Explore Similar Projects

InternEvo by InternLM (0.2%, 407 stars)
Lightweight training framework for model pre-training
Created 1 year ago · Updated 4 weeks ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface (0.3%, 9k stars)
PyTorch training helper for distributed execution
Created 4 years ago · Updated 1 day ago
Starred by Tobi Lutke (Cofounder of Shopify), Li Jiang (Coauthor of AutoGen; Engineer at Microsoft), and 26 more.

ColossalAI by hpcaitech (0.1%, 41k stars)
AI system for large-scale parallel training
Created 3 years ago · Updated 15 hours ago