MatX by NVIDIA

C++ library for GPU numerical computing with Python-like syntax

Created 4 years ago
1,351 stars

Top 29.7% on SourcePulse

Project Summary

MatX is a C++17 library for high-performance numerical computing on NVIDIA GPUs and CPUs. It targets researchers and engineers who need efficient tensor operations with a Python-like syntax, and it aims to provide near-native performance with less code complexity than lower-level CUDA programming or even GPU libraries like CuPy.

How It Works

MatX leverages optimized backend libraries and employs efficient kernel generation for custom operations. Its core design revolves around a C++ template-based tensor abstraction that allows for operator overloading and expression fusion. This enables the compiler to optimize complex sequences of operations, minimizing intermediate data movement and maximizing computational throughput. The library supports a wide range of data types, including half-precision and complex numbers, with specialized wrappers for seamless host and device execution.
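
The expression-fusion idea described above can be illustrated with a minimal, self-contained sketch of C++ expression templates. This is not MatX's actual implementation or API: the names Tensor1D, BinaryExpr, Add, Mul, and fused_demo are hypothetical, the code runs on the CPU only, and it handles only element-wise addition and multiplication of equally sized 1-D arrays. It shows how overloaded operators can build a lightweight expression object, so that the whole right-hand side is evaluated in a single loop with no intermediate arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Marker base class: anything derived from it may appear in an expression.
struct ExprTag {};

// Leaf node (hypothetical name): a 1-D array that owns its data.
struct Tensor1D : ExprTag {
    std::vector<float> data;

    explicit Tensor1D(std::size_t n, float fill = 0.0f) : data(n, fill) {}

    float operator()(std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression evaluates it element by element in ONE loop.
    template <typename Expr>
    Tensor1D& operator=(const Expr& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data[i] = e(i);
        return *this;
    }
};

// Leaves are held by reference; expression nodes are small and held by value.
template <typename T>
using Stored = std::conditional_t<std::is_same_v<T, Tensor1D>, const Tensor1D&, T>;

struct Add { static float apply(float x, float y) { return x + y; } };
struct Mul { static float apply(float x, float y) { return x * y; } };

// Interior node: records the operands and the operation, computes nothing yet.
template <typename L, typename R, typename Op>
struct BinaryExpr : ExprTag {
    Stored<L> lhs;
    Stored<R> rhs;

    BinaryExpr(const L& l, const R& r) : lhs(l), rhs(r) {}

    float operator()(std::size_t i) const { return Op::apply(lhs(i), rhs(i)); }
    std::size_t size() const { return lhs.size(); }
};

template <typename T>
constexpr bool is_expr_v = std::is_base_of_v<ExprTag, T>;

// The operators only build expression objects; no loop runs here.
template <typename L, typename R,
          typename = std::enable_if_t<is_expr_v<L> && is_expr_v<R>>>
auto operator+(const L& l, const R& r) { return BinaryExpr<L, R, Add>(l, r); }

template <typename L, typename R,
          typename = std::enable_if_t<is_expr_v<L> && is_expr_v<R>>>
auto operator*(const L& l, const R& r) { return BinaryExpr<L, R, Mul>(l, r); }

// Hypothetical demo: computes out[i] = a[i] * b[i] + c[i] in a single pass.
std::vector<float> fused_demo() {
    Tensor1D a(4, 2.0f), b(4, 3.0f), c(4, 1.0f), out(4);
    out = a * b + c;  // one loop, no temporary array for a * b
    return out.data;
}
```

Calling fused_demo() fills every element with 2 * 3 + 1. In this sketch the "fusion" happens because a * b + c has a type that encodes the whole computation, which the assignment then evaluates per element; a GPU library would presumably launch a kernel at that point instead of running a host loop.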

Quick Start & Requirements

  • Installation: Header-only for application use; build tests/examples via CMake.
  • Prerequisites: CUDA 11.8 or 12.2.1+; GCC 9+, nvc++ 24.5, or Clang 17+; Linux OS. Supports Pascal to Hopper GPUs and Jetson (Jetpack 5.0+).
  • Resources: CMake fetches dependencies; compilation can be lengthy without parallelism.
  • Docs: Official Documentation
  • Quick Start: Quick Start Guide
  • Notebooks: Jupyter Notebooks

Highlighted Details

  • Achieves over 4x speedup compared to CuPy and 2100x over NumPy for FFT resamplers on an A100 GPU.
  • Supports Python-like syntax for tensor manipulation and operations.
  • Integrates easily with existing C++ projects via CMake.
  • Offers web-based visualization of GPU data.

Maintenance & Community

  • Active development by NVIDIA.
  • Discussions board available for user interaction.
  • Issue reporting guidelines provided with specific prefixes ([BUG], [DOC], [FEA], [QST]).
  • Contribution guide available in CONTRIBUTING.md.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Permissive license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

  • Officially supported on Linux only, because that is the only platform tested; Windows support is community-driven.
  • CUDA 12.0.0-12.2.0 may cause build issues with unit tests.
  • Building documentation requires several external dependencies (Doxygen, Sphinx, etc.).

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 14
  • Issues (30d): 2
  • Star History: 8 stars in the last 30 days
