GPU collective communication library for ML workloads
UCCL is an open-source collective communication library designed to improve GPU communication performance for machine learning workloads, offered as a drop-in replacement for NCCL/RCCL. It targets researchers and practitioners seeking lower latency and higher throughput, particularly in heterogeneous GPU and networking environments.
How It Works
UCCL re-architects the communication layer to maximize hardware potential, featuring a custom software transport layer that employs packet spraying across numerous network paths to avoid congestion. This approach, combined with advanced congestion control and efficient loss recovery, aims to outperform traditional single-path transports like kernel TCP and RDMA.
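The multi-path idea can be pictured with a small conceptual sketch (illustration only, not UCCL's actual code or API): a message is split into chunks and the chunks are assigned round-robin across several network paths, so no single path carries the whole flow. The function name and chunk size below are arbitrary.

```python
# Conceptual sketch of packet spraying (illustration only, not UCCL's implementation):
# split a message into chunks and rotate them across several network paths so that
# no single path becomes a hotspot.

def spray(message: bytes, num_paths: int, chunk_size: int = 4096):
    """Yield (path_id, chunk) pairs, assigning chunks to paths round-robin."""
    for offset in range(0, len(message), chunk_size):
        chunk = message[offset:offset + chunk_size]
        yield (offset // chunk_size) % num_paths, chunk

# Example: a 1 MiB buffer sprayed over 8 paths lands roughly evenly on each path.
counts = {}
for path_id, _ in spray(b"\0" * (1 << 20), num_paths=8):
    counts[path_id] = counts.get(path_id, 0) + 1
print(counts)
```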
Quick Start & Requirements
Clone the repository with git clone and build with bash build_and_install.sh [cuda|rocm]. Then set the NCCL_NET_PLUGIN and LD_PRELOAD environment variables to point to the UCCL plugins for your network configuration (IB/RoCE, AWS EFA).
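As a rough usage sketch of that environment-variable wiring (the plugin install path and the train.py launch command are assumptions for illustration, not taken from the repository; use whatever path build_and_install.sh actually produces):

```python
# Minimal sketch: point NCCL at the UCCL plugin via environment variables and
# launch an otherwise unmodified workload. Paths below are assumed, not official.
import os
import subprocess

env = dict(os.environ)
env["NCCL_NET_PLUGIN"] = "/opt/uccl/libnccl-net-uccl.so"  # assumed install path
env["LD_PRELOAD"] = env["NCCL_NET_PLUGIN"]

# The training script itself needs no changes; NCCL picks up the preloaded plugin.
subprocess.run(["python", "train.py"], env=env, check=True)
```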
Maintenance & Community
Actively developed at UC Berkeley Sky Computing Lab and UC Davis ArtSy lab. Supported by AMD, AWS, Broadcom, CloudLab, Google Cloud, IBM, Lambda, and Mibura. Community engagement via GitHub issues.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is under active development, with features like dynamic membership and improved KV cache transfer still pending. The absence of a specified license may pose adoption challenges for commercial applications.