Lightweight training framework for model pre-training
InternEvo is a lightweight training framework for efficient large language model pre-training and fine-tuning, scaling from single-GPU setups to massive clusters. It aims to simplify the training process with minimal dependencies, enabling high performance and accelerated training on large-scale hardware.
How It Works
InternEvo employs a modular design that integrates multiple parallelization strategies, including Data Parallelism, Tensor Parallelism (MTP), Pipeline Parallelism, Sequence Parallelism (FSP, ISP), and ZeRO optimization. This layered approach enables efficient scaling to thousands of GPUs while keeping memory usage in check, contributing to the framework's reported high acceleration efficiency. The framework also supports streaming datasets and integrates with libraries like Flash-Attention for further performance gains.
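As a hedged illustration, the sketch below shows how such a strategy mix might be expressed in the dict-based Python config style used by InternEvo's published example configs; the exact field names and defaults are assumptions to verify against the documentation.

```python
# Illustrative parallelism block for an InternEvo-style Python config.
# The dict layout mirrors the project's example configs, but treat the
# exact fields and defaults as assumptions to check against the docs.
parallel = dict(
    zero1=dict(size=8),                  # ZeRO-1: shard optimizer states over 8 ranks
    tensor=dict(size=2, mode="mtp"),     # tensor parallel; mode: "mtp", "fsp", or "isp"
    pipeline=dict(size=1, interleaved_overlap=True),  # pipeline parallel degree
    weight=dict(size=1, overlap=True),   # weight parallelism, paired with ISP
)
```

In frameworks of this design, the data-parallel degree is typically derived from the remaining ranks, so the product of all parallel sizes matches the total world size.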
Quick Start & Requirements
Install via pip install InternEvo. Requires PyTorch (torch==2.1.0+cu118), torchvision, torchaudio, and torch-scatter. Optional: flash-attn==2.2.1 for acceleration. Use torchrun for distributed execution.
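A quick sanity check of the pinned environment can save a failed launch. The snippet below is a minimal sketch using only standard PyTorch introspection; the torchrun line assumes a train.py entry point and an example config path, both hypothetical here.

```python
# Minimal environment sanity check for the pinned requirements (sketch).
import torch

print(torch.__version__)          # expect 2.1.0+cu118
print(torch.version.cuda)         # expect 11.8
print(torch.cuda.is_available())  # GPUs must be visible before launching

# Launch distributed training with torchrun, e.g. (hypothetical paths):
#   torchrun --nproc_per_node=8 train.py --config ./configs/7B_sft.py
```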
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The framework requires specific PyTorch and CUDA versions for optimal performance, and Flash-Attention installation is conditional on environment support. Configuring diverse hardware setups may require consulting the extensive documentation.
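Because Flash-Attention support is environment-dependent, a guarded import is the usual fallback pattern; the sketch below is illustrative, not InternEvo's actual code.

```python
# Probe for the optional flash-attn package and fall back gracefully (sketch).
try:
    import flash_attn  # noqa: F401  # needs a compatible GPU and CUDA toolchain
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False  # training proceeds with standard attention kernels

print(f"flash-attn available: {HAS_FLASH_ATTN}")
```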