cccl by NVIDIA

CUDA C++ building blocks for high-performance GPU computing

Created 5 years ago
2,266 stars

Top 19.5% on SourcePulse

Project Summary

CUDA Core Compute Libraries (CCCL) unifies essential CUDA C++ libraries—Thrust, CUB, and libcudacxx—into a single, header-only repository. It targets CUDA C++ developers seeking to simplify the creation of safe, efficient, and high-performance GPU code, offering a streamlined development process and broader leverage of CUDA capabilities.

How It Works

CCCL integrates Thrust's high-level parallel algorithms, CUB's low-level CUDA-specific primitives, and libcudacxx's implementation of the CUDA C++ Standard Library. This unification gives developers a cohesive set of building blocks, improving productivity and enabling performance portability across GPUs and CPUs via configurable backends. The approach consolidates existing, well-regarded libraries into a single, managed project.

Quick Start & Requirements

CCCL is header-only. Integration is straightforward:

  • CUDA Toolkit: Headers are automatically included when compiling with nvcc.
  • GitHub: Clone the repository and include headers via -I flags (e.g., nvcc -Icccl/thrust ... main.cu).
  • Conda: Install via conda install cccl or conda install cuda-cccl from conda-forge.
  • CMake: Standard cmake . && make install or cmake --preset install workflows are supported.
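As a sketch of the CMake route, the fragment below fetches CCCL from GitHub and links against its exported target. The `CCCL::CCCL` target name follows upstream documentation but should be verified against the CCCL version you pin; this is an assumption-laden sketch, not a canonical configuration.

```cmake
cmake_minimum_required(VERSION 3.21)
project(example LANGUAGES CXX CUDA)

# Fetch CCCL from GitHub (pin a release tag rather than main in practice).
include(FetchContent)
FetchContent_Declare(CCCL
  GIT_REPOSITORY https://github.com/NVIDIA/cccl.git
  GIT_TAG        main)
FetchContent_MakeAvailable(CCCL)

add_executable(example main.cu)
# CCCL::CCCL aggregates the Thrust, CUB, and libcudacxx include paths.
target_link_libraries(example PRIVATE CCCL::CCCL)
```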

Requirements: the CUDA Toolkit; a compatible host compiler (e.g., GCC >= 7.x on Linux); C++17 or C++20; and any GPU architecture supported by the CUDA Toolkit. Links: Documentation, Live Demo (Godbolt), GitHub Repository.

Highlighted Details

  • Header-only distribution simplifies integration.
  • Unified API across Thrust, CUB, and libcudacxx.
  • Supports C++17 and C++20 standards.
  • Backward compatible with current and preceding CUDA Toolkit major versions.

Maintenance & Community

The project is maintained by NVIDIA. A Discord server is available for community interaction, and a Contributor Guide outlines how to contribute to development.

Licensing & Compatibility

The specific license is not detailed in the provided README excerpt. CCCL is generally compatible with all operating systems and host compilers supported by the CUDA Toolkit. It maintains backward compatibility with CUDA Toolkit versions but is not forward compatible.

Limitations & Caveats

ABI stability is not guaranteed for the thrust:: and cub:: namespaces. ABI changes in the cuda:: namespace are versioned, but users should still recompile binaries when an ABI break occurs. Only the latest CCCL version is supported; fixes are not backported to older releases. New features may require newer CUDA Toolkit versions.

Health Check

  • Last Commit: 23 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 274
  • Issues (30d): 100
  • Star History: 64 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems") and Ying Sheng (coauthor of SGLang).

fastllm by ztxz16

Top 0.1% · 4k stars
High-performance C++ LLM inference library
Created 2 years ago · Updated 2 days ago
Starred by David Cournapeau (author of scikit-learn), Stas Bekman (author of "Machine Learning Engineering Open Book"; research engineer at Snowflake), and 5 more.

lectures by gpu-mode

Top 0.3% · 6k stars
Lecture series for GPU-accelerated computing
Created 2 years ago · Updated 2 months ago
Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Eric Zhang (founding engineer at Modal), and 9 more.

DeepGEMM by deepseek-ai

Top 0.2% · 6k stars
CUDA library for efficient FP8 GEMM kernels with fine-grained scaling
Created 1 year ago · Updated 2 weeks ago