raft by rapidsai

CUDA-accelerated primitives for ML/data mining algorithms

Created 6 years ago
970 stars

Top 38.0% on SourcePulse

Project Summary

RAFT provides CUDA-accelerated C++ primitives and Python bindings for high-performance machine learning and data mining tasks. It serves as a foundational library for building accelerated applications, particularly for data scientists and application developers needing low-level GPU-optimized computations.

How It Works

RAFT employs a header-only C++ template library approach, allowing for flexible integration and compile-time optimizations. It offers an optional shared library for host-accessible runtime APIs, simplifying usage without requiring a CUDA compiler. The library leverages RAPIDS Memory Manager (RMM) for efficient memory handling and mdspan/mdarray for multi-dimensional data representation, enabling seamless interoperability with other GPU-accelerated libraries.
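
As a brief illustration of the Python layer, the sketch below computes pairwise Euclidean distances with pylibraft, following the example pattern in the RAFT README; it assumes CuPy and a working CUDA environment.

    # Sketch adapted from the RAFT README's pylibraft example.
    import cupy as cp
    from pylibraft.distance import pairwise_distance

    n_samples, n_features = 5000, 50
    in1 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32)
    in2 = cp.random.random_sample((n_samples, n_features), dtype=cp.float32)

    # Runs on the GPU and returns a device_ndarray.
    output = pairwise_distance(in1, in2, metric="euclidean")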

Quick Start & Requirements

  • Install: Conda is the recommended installation method (a quick import check follows this list):
    # For CUDA 12.8
    mamba install -c rapidsai -c conda-forge -c nvidia raft-dask pylibraft cuda-version=12.8
    
    Pip installation of the Python packages is also available; the -cu11 suffix targets CUDA 11 (use pylibraft-cu12 for CUDA 12):
    pip install pylibraft-cu11 --extra-index-url=https://pypi.nvidia.com
    
  • Prerequisites: a CUDA Toolkit matching the cuda-version pinned at install time; Conda or Mamba for the conda route.
  • Resources: Installation via Conda is generally quick. C++ compilation from source may require significant build time.
  • Docs: RAFT Reference Documentation, Getting Started
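
A minimal post-install check, assuming the conda environment created above is active:

    # Minimal sketch: confirm pylibraft imports and report its version.
    # Assumes pylibraft exposes __version__, as RAPIDS packages generally do.
    import pylibraft
    print(pylibraft.__version__)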

Highlighted Details

  • Vector Search Migration: Vector search and clustering algorithms are migrating to the cuVS library. RAFT headers for these will be removed after the December 2024 release.
  • Interoperability: pylibraft outputs are compatible with libraries supporting __cuda_array_interface__ (CuPy, PyTorch, JAX, TensorFlow) and DLPack, enabling zero-copy conversions (see the sketch after this list).
  • Data Structures: Utilizes mdspan and mdarray for efficient multi-dimensional array handling on host and device.
  • Core Components: Includes primitives for linear algebra, sparse/dense operations, solvers, statistics, and utilities.
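
To make the zero-copy claim concrete, here is a hedged sketch: the device_ndarray returned by pylibraft implements __cuda_array_interface__, so cp.asarray wraps the existing device buffer rather than copying it. The pairwise_distance call mirrors the example above.

    import cupy as cp
    from pylibraft.distance import pairwise_distance

    x = cp.random.random_sample((100, 10), dtype=cp.float32)

    # Returns a device_ndarray exposing __cuda_array_interface__.
    dists = pairwise_distance(x, x, metric="euclidean")

    # Zero-copy: CuPy wraps the same device buffer instead of copying it.
    dists_cupy = cp.asarray(dists)
    print(dists_cupy.shape)  # (100, 100)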

Maintenance & Community

RAFT is part of the NVIDIA RAPIDS ecosystem. Community support is available via RAPIDS Community.

Licensing & Compatibility

RAFT is released under the Apache 2.0 License, permitting commercial use and linking with closed-source projects.

Limitations & Caveats

The project explicitly states that vector search and clustering algorithms are deprecated in RAFT in favor of the cuVS library, with the corresponding headers removed after the December 2024 release. Users relying on these RAFT components should migrate to cuVS.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 21
  • Issues (30d): 10
  • Star History: 11 stars in the last 30 days

Explore Similar Projects

Starred by David Cournapeau (author of scikit-learn), Stas Bekman (author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 5 more.

lectures by gpu-mode · 0.8% · 6k stars
Lecture series for GPU-accelerated computing
Created 2 years ago · Updated 1 month ago
Starred by Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems"), Eric Zhang (Founding Engineer at Modal), and 9 more.

DeepGEMM by deepseek-ai · 0.4% · 6k stars
CUDA library for efficient FP8 GEMM kernels with fine-grained scaling
Created 11 months ago · Updated 5 days ago
Starred by Bojan Tunguz (AI Scientist; formerly at NVIDIA), Alex Chen (cofounder of Nexa AI), and 19 more.

ggml by ggml-org · 0.2% · 14k stars
Tensor library for machine learning
Created 3 years ago · Updated 15 hours ago
Starred by Tri Dao (Chief Scientist at Together AI), Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems"), and 23 more.

cutlass by NVIDIA · 0.5% · 9k stars
CUDA C++ and Python DSLs for high-performance linear algebra
Created 8 years ago · Updated 2 days ago