ROCm library for GPU collective communication routines
Top 72.6% on SourcePulse
The ROCm Communication Collectives Library (RCCL) provides optimized collective communication routines for GPUs, targeting researchers and developers building large-scale AI and HPC applications. It enables efficient inter-GPU communication across multiple nodes, aiming to maximize bandwidth and minimize latency.
How It Works
RCCL implements standard collective operations like all-reduce, broadcast, and all-gather using ring and tree algorithms. It is optimized for various interconnects (PCIe, xGMI, InfiniBand, TCP/IP) and supports arbitrary numbers of GPUs in single or multi-node, multi-process applications. For performance, small operations can be batched or aggregated via the API.
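To make the ring algorithm mentioned above concrete, here is a minimal CPU-only sketch of a sum all-reduce over a ring, written in plain Python. It is an illustration of the general technique, not RCCL's implementation or API: the function name `ring_all_reduce` and the list-of-lists representation of per-rank buffers are assumptions made for this example. Each rank's buffer is split into one chunk per rank; a reduce-scatter phase accumulates partial sums around the ring, and an all-gather phase then circulates the finished chunks.

```python
def ring_all_reduce(buffers):
    """Simulate a sum all-reduce across len(buffers) ranks arranged in a ring.

    `buffers` is a list of equal-length lists, one per simulated rank
    (a hypothetical stand-in for per-GPU device buffers).
    Returns a new list of per-rank buffers, each holding the element-wise sum.
    """
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must contribute buffers of equal length")

    # Split the index space into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]

    def chunk(i):
        return range(bounds[i], bounds[i + 1])

    data = [list(b) for b in buffers]  # work on copies

    # Phase 1: reduce-scatter. In step s, rank r sends chunk (r - s) mod n
    # to its right neighbour, which adds it into its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot all outgoing messages first so sends are "simultaneous".
        msgs = []
        for r in range(n):
            c = (r - step) % n
            msgs.append((c, [data[r][j] for j in chunk(c)]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = msgs[r]
            for j, v in zip(chunk(c), payload):
                data[dst][j] += v

    # After phase 1, rank r holds the fully reduced chunk (r + 1) mod n.
    # Phase 2: all-gather. In step s, rank r forwards chunk (r + 1 - s) mod n
    # to its right neighbour, which overwrites its copy with the final values.
    for step in range(n - 1):
        msgs = []
        for r in range(n):
            c = (r + 1 - step) % n
            msgs.append((c, [data[r][j] for j in chunk(c)]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = msgs[r]
            for j, v in zip(chunk(c), payload):
                data[dst][j] = v

    return data


# Three simulated ranks, four elements each.
result = ring_all_reduce([[1, 2, 3, 4],
                          [10, 20, 30, 40],
                          [100, 200, 300, 400]])
print(result[0])  # [111, 222, 333, 444]
```

In this scheme each rank sends and receives 2 * (n - 1) chunk-sized messages, so the per-rank traffic stays close to twice the buffer size regardless of the number of ranks, which is why ring algorithms are attractive for large, bandwidth-bound messages.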
Quick Start & Requirements
Install with the install.sh script (./install.sh) or build manually with CMake.
install.sh offers options for quick builds, debugging, and targeting specific GPU architectures. A manual build requires cmake .. && make -j <jobs>.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The install.sh script simplifies initial setup.