PyTorch tool for pipeline parallelism
PiPPy provides a compiler and runtime for automating pipeline parallelism in PyTorch models, targeting researchers and engineers scaling large deep learning models. It simplifies the implementation of pipeline parallelism, enabling efficient execution across multiple devices and hosts with minimal code modification.
How It Works
PiPPy automatically partitions a PyTorch model into stages by tracing its execution graph. It then transforms these stages into a `Pipe` object, which defines the data flow between stages. A `PipelineStage` runtime executes these stages concurrently, managing micro-batch splitting, inter-stage communication, and gradient synchronization. This approach allows automatic handling of complex model topologies such as skip connections and tied weights.
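Conceptually, the runtime's micro-batch scheduling can be sketched in plain Python. This is a minimal illustration of the idea, not PiPPy's actual implementation; the stage functions and helper names below are hypothetical stand-ins for the partitions of a real model:

```python
# Illustrative sketch (not the PiPPy API): simulate how a pipeline
# runtime splits a batch into micro-batches and feeds each one
# through a sequence of stages.

def split_into_microbatches(batch, n_microbatches):
    """Split a list-based batch into roughly equal micro-batches."""
    size = (len(batch) + n_microbatches - 1) // n_microbatches
    return [batch[i:i + size] for i in range(0, len(batch), size)]

def run_pipeline(stages, batch, n_microbatches):
    """Run each micro-batch through every stage in order, then
    reassemble the per-micro-batch outputs into one batch."""
    outputs = []
    for mb in split_into_microbatches(batch, n_microbatches):
        for stage in stages:
            mb = stage(mb)
        outputs.extend(mb)
    return outputs

# Two toy "stages": each maps a function over its micro-batch.
stages = [
    lambda mb: [x + 1 for x in mb],   # stage 0
    lambda mb: [x * 2 for x in mb],   # stage 1
]

print(run_pipeline(stages, [1, 2, 3, 4], n_microbatches=2))  # [4, 6, 8, 10]
```

In the real runtime the stages live on different devices and micro-batches flow through them concurrently, overlapping computation with inter-stage communication; the serial loop above only shows the data-splitting logic.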
Quick Start & Requirements
Install the nightly requirements with:

pip install -r requirements.txt --find-links https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

(or the corresponding CUDA wheel index), then install PiPPy with `python setup.py install` or `python setup.py develop`.
Highlighted Details
Model split points can be specified manually via the `annotate_split_points` or `pipe_split()` API.
Maintenance & Community
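As a rough illustration of what split-point annotation accomplishes (a hypothetical sketch, not PiPPy's API): split points mark boundaries in an ordered sequence of layers, and the partitioner groups the layers between consecutive boundaries into pipeline stages:

```python
# Hypothetical sketch of split-point partitioning (not PiPPy's
# implementation): given an ordered list of layer names and the set
# of layers at which a new stage should begin, group the layers
# into pipeline stages.

def partition_by_split_points(layers, split_points):
    """Start a new stage at each layer named in split_points."""
    stages, current = [], []
    for layer in layers:
        if layer in split_points and current:
            stages.append(current)
            current = []
        current.append(layer)
    stages.append(current)
    return stages

layers = ["embed", "block0", "block1", "block2", "head"]
print(partition_by_split_points(layers, {"block1", "head"}))
# [['embed', 'block0'], ['block1', 'block2'], ['head']]
```

Each resulting group corresponds to one pipeline stage that would be placed on its own device.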
PiPPy has been migrated into PyTorch as `torch.distributed.pipelining`. The original repository now serves as a collection of examples, and its library code will be removed.
Licensing & Compatibility
PiPPy is licensed under the 3-clause BSD license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The original PiPPy repository's library code is slated for removal; users are directed to the `torch.distributed.pipelining` subpackage in PyTorch. The repository itself now primarily serves as a set of examples.