FlagScale by FlagOpen

Large model toolkit for end-to-end management and scaling

Created 2 years ago
353 stars

Top 79.0% on SourcePulse

View on GitHub
Project Summary

FlagScale is a comprehensive toolkit designed to streamline the entire lifecycle of large language models, from development to deployment. It targets researchers and engineers working with large models, offering a unified platform to maximize computational efficiency and enhance model performance across diverse hardware architectures.

How It Works

FlagScale integrates and extends popular open-source projects such as Megatron-LM and vLLM through a flexible, multi-backend mechanism. It supports heterogeneous parallelism, enabling training and inference across different chip architectures (e.g., NVIDIA, Iluvatar) within a single instance. This approach aims to simplify complex distributed setups and unlock performance gains by exploiting each vendor's specialized hardware.
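
As a rough sketch of the multi-backend idea, the commands below show how a single Hydra-driven entry point could dispatch to either the training or the serving backend simply by pointing at a different config; --config-path and --config-name are standard Hydra flags, while the directories and file names here are hypothetical placeholders, not documented FlagScale options.

    # Hypothetical invocations of the unified runner (a sketch, not the documented interface).
    # Train through the Megatron-LM-based backend using an experiment-level config.
    python run.py --config-path ./examples/my_model/conf --config-name train

    # Switch to serving through the vLLM-based backend by selecting a different task-level config.
    python run.py --config-path ./examples/my_model/conf --config-name serve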

Quick Start & Requirements

  • Installation: Clone the repository, then run ./install-requirements.sh --env train and ./install-requirements.sh --env inference to set up the conda environments (see the command sketch after this list). Custom extensions for vLLM and Megatron-Energon may require additional pip install commands.
  • Prerequisites: NGC's PyTorch container is recommended. Specific model training/serving may require datasets in Megatron-LM format.
  • Configuration: Uses Hydra for configuration management with experiment-level and task-level YAML files.
  • Running Tasks: A unified runner (python run.py) handles training, inference, and serving via configuration files.
  • CLI: pip install . installs a CLI for one-click deployment (e.g., flagscale serve deepseek_r1).
  • Documentation: Refer to the Quick Start section in the README for detailed instructions.
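
For orientation, here is a minimal end-to-end command sketch assembled from the steps above; the repository URL assumes the standard FlagOpen GitHub path, and the config names are placeholders for your own experiment-level and task-level YAML files.

    # Clone and set up the training and inference conda environments (steps from above).
    git clone https://github.com/FlagOpen/FlagScale.git   # assumed repository path
    cd FlagScale
    ./install-requirements.sh --env train
    ./install-requirements.sh --env inference

    # Launch a task through the unified runner (placeholders; see the earlier sketch for the flags).
    python run.py --config-path <config-dir> --config-name <config-name>

    # Install the CLI and serve a model with one command.
    pip install .
    flagscale serve deepseek_r1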

Highlighted Details

  • Supports heterogeneous pre-training and decoding across different chips within a single instance (FlagCX beta).
  • Achieved State-of-the-Art (SOTA) results on the Infinity-MM dataset with LLaVA-OneVision.
  • Accelerated generation and understanding tasks for Emu3 via an optimized classifier-free guidance (CFG) implementation.
  • Demonstrated heterogeneous hybrid training of Aquila2-70B-Expr across NVIDIA and Iluvatar chips.

Maintenance & Community

  • Developed with backing from the Beijing Academy of Artificial Intelligence (BAAI) as part of the FlagAI-Open initiative.
  • Recent updates (v0.8.0, v0.6.5, v0.6.0) show active development with new features and vendor adaptations.

Licensing & Compatibility

  • Licensed under the Apache License (Version 2.0).
  • Contains third-party components under other open-source licenses; refer to the LICENSE file for details.

Limitations & Caveats

  • Some features like heterogeneous prefill-decoding disaggregation and DeepSeek-v3 distributed pre-training are noted as beta.
  • Full integration requires patching and unpatching the backend code, which adds complexity to setup and maintenance.

Health Check

  • Last Commit: 20 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 71
  • Issues (30d): 3

Star History

8 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Johannes Hagemann (Cofounder of Prime Intellect), and 4 more.

S-LoRA by S-LoRA

0.2%
2k
System for scalable LoRA adapter serving
Created 1 year ago
Updated 1 year ago
Starred by François Chollet (Author of Keras; Cofounder of Ndea, ARC Prize), Chaoyu Yang (Founder of Bento), and 13 more.

neon by NervanaSystems

0%
4k
Deep learning framework (discontinued)
Created 11 years ago
Updated 4 years ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface

0.3%
9k
PyTorch training helper for distributed execution
Created 4 years ago
Updated 1 day ago