Distributed training framework for TF, Keras, PyTorch, and MXNet
Horovod is a distributed training framework for deep learning that makes it simple to scale a single-GPU training script across many GPUs and nodes. It targets researchers and engineers working with TensorFlow, Keras, PyTorch, and Apache MXNet, letting them adopt distributed training with only a few lines of code changes while achieving significant performance gains.
How It Works
Horovod uses a ring-based allreduce algorithm, borrowing concepts such as rank, size, and allreduce from the Message Passing Interface (MPI), to synchronize gradients across workers efficiently. It reduces communication overhead by overlapping gradient computation with communication, and uses tensor fusion to batch many small allreduce operations into larger ones, further boosting performance. Existing single-GPU training scripts need only minimal modifications, as the sketch below shows.
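The pattern is the same in every supported framework: initialize Horovod, shard the data, wrap the optimizer, and broadcast the initial state. A minimal sketch in PyTorch, where the toy model, dataset, and hyperparameters are illustrative placeholders (the `hvd.*` calls follow Horovod's PyTorch API):

```python
# Hypothetical minimal example: adapting a single-GPU PyTorch script to Horovod.
# The model, dataset, and hyperparameters below are placeholders.
import torch
import horovod.torch as hvd

hvd.init()                               # one process per GPU
torch.cuda.set_device(hvd.local_rank())  # pin each process to its local GPU

model = torch.nn.Linear(128, 10).cuda()
dataset = torch.utils.data.TensorDataset(
    torch.randn(1024, 128), torch.randint(0, 10, (1024,)))

# Shard the data so each worker trains on a distinct slice.
sampler = torch.utils.data.distributed.DistributedSampler(
    dataset, num_replicas=hvd.size(), rank=hvd.rank())
loader = torch.utils.data.DataLoader(dataset, batch_size=32, sampler=sampler)

# Scaling the learning rate by the worker count is a common convention.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged via ring-allreduce.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# Start all workers from identical weights and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

loss_fn = torch.nn.CrossEntropyLoss()
for epoch in range(2):
    sampler.set_epoch(epoch)             # reshuffle shards each epoch
    for x, y in loader:
        x, y = x.cuda(), y.cuda()
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()  # allreduce overlaps with backward
        optimizer.step()
```

The wrapped optimizer is where the interleaving described above happens: allreduce operations are launched as soon as each layer's gradients are ready during the backward pass, rather than waiting for the whole pass to finish.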
Quick Start & Requirements
Installing from pip builds Horovod from source, so a C++ compiler, CMake, and at least one supported framework must already be installed; the NCCL build additionally requires NCCL 2.

```bash
# CPU-only build:
pip install horovod

# Build with NCCL support for GPU allreduce:
HOROVOD_GPU_OPERATIONS=NCCL pip install horovod
```
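Once installed, a script like the sketch above is launched with the horovodrun wrapper; the process counts, script name, and hostnames below are illustrative:

```bash
# Run 4 training processes on the local machine (one per GPU).
horovodrun -np 4 python train.py

# Run across two hosts with 4 GPU slots each.
horovodrun -np 8 -H server1:4,server2:4 python train.py
```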
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats