ModelCenter by OpenBMB

Transformer library for efficient, low-resource, distributed training

Created 3 years ago

264 stars

Top 96.8% on SourcePulse

Project Summary

ModelCenter provides efficient, low-resource, and extendable implementations of large pre-trained language models (PLMs) for distributed training. It targets researchers and engineers working with large transformer models, offering a more memory-efficient and user-friendly alternative to frameworks like DeepSpeed and Megatron.

How It Works

ModelCenter leverages the OpenBMB/BMTrain backend, which integrates ZeRO optimization for efficient distributed training. This approach significantly reduces memory footprints, enabling larger batch sizes and better GPU utilization. The framework is designed for PyTorch-style coding, aiming for easier configuration and a more uniform development experience compared to other distributed training solutions.

Quick Start & Requirements

Installation: pip install model-center or from source.
Prerequisites: PyTorch, Python. Distributed training requires torch.distributed or torchrun.
Documentation: https://github.com/OpenBMB/ModelCenter

Highlighted Details

Supports a wide range of popular PLMs including BERT, RoBERTa, T5, GPT-2, GPT-J, Longformer, GLM, ViT, and LLaMA.
Features efficient memory utilization, reducing memory footprint by several times.
Optimized for low-resource distributed training with ZeRO optimization.
Includes implementations for beam search generation for models like T5 and LLaMA.

Maintenance & Community

Active development with regular updates (e.g., LLaMA support added May 2023).
Community channels include QQ Group (735930538), Website (https://www.openbmb.org), Twitter (https://twitter.com/OpenBMB).

Licensing & Compatibility

License: Apache 2.0 License.
Compatibility: Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The project is built upon BMTrain, and its performance and feature set are closely tied to that dependency. While it supports many models, specific model implementations or advanced features might still be under active development.

ModelCenter by OpenBMB

Explore Similar Projects

MegaDLMs by JinjieNi

libai by Oneflow-Inc

PatrickStar by Tencent

InternEvo by InternLM

MINI_LLM by jiahe7ay

gpt-2-tensorflow2.0 by akanyaani

BMTrain by OpenBMB

tiny-llm-zh by wdndev

Megatron-DeepSpeed by bigscience-workshop

apex by NVIDIA

pytorch-lightning by Lightning-AI

ColossalAI by hpcaitech