ggml  by ggml-org

Tensor library for machine learning

created 2 years ago
12,898 stars

Top 3.9% on sourcepulse

Project Summary

ggml is a C tensor library designed for machine learning, focusing on efficient execution across diverse hardware. It targets developers and researchers needing a low-level, dependency-free tensor computation engine for inference and training, particularly on resource-constrained devices. The library's key benefit is its portability and performance through features like integer quantization and broad hardware acceleration.

How It Works

ggml employs a low-level C implementation for maximum portability and minimal overhead. It supports integer quantization to shrink models and reduce memory bandwidth, enabling faster inference on CPUs and GPUs. The library provides automatic differentiation and built-in optimizers (ADAM, L-BFGS), supporting both inference and training workflows. A notable design choice is that it performs zero memory allocations during runtime, which contributes to predictable performance.

Quick Start & Requirements

  • Install: Clone the repository, set up a Python virtual environment, and install dependencies with pip install -r requirements.txt.
  • Build: Use CMake: mkdir build && cd build && cmake .. && cmake --build . --config Release -j 8.
  • Prerequisites: C++ compiler, CMake, Python 3.10+. Optional: CUDA 12.1+ for GPU acceleration, hipBLAS for AMD GPUs, Intel oneAPI for SYCL. Android development requires the NDK.
  • Resources: Introduction to ggml, GGUF file format.
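Put together, the install and build steps above look like this (the repository URL is assumed from the ggml-org/ggml name):

```shell
# Clone the repository (URL assumed from the org/project name)
git clone https://github.com/ggml-org/ggml
cd ggml

# Python virtual environment and dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Configure and build with CMake (Release, 8 parallel jobs)
mkdir build && cd build
cmake ..
cmake --build . --config Release -j 8
```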

Highlighted Details

  • Broad hardware support including CPU, CUDA, hipBLAS, and SYCL.
  • Integer quantization support for efficient inference.
  • Automatic differentiation and optimizers (ADAM, L-BFGS).
  • Zero memory allocations during runtime.

Maintenance & Community

Development is active; ggml also underpins related projects such as llama.cpp and whisper.cpp, which see significant contributions of their own.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial or closed-source use.

Limitations & Caveats

The project is under active development, implying potential for breaking changes. Specific hardware acceleration configurations (CUDA, hipBLAS, SYCL) require careful setup and may have version dependencies.

Health Check
  • Last commit: 3 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 23
  • Issues (30d): 7
  • Star History: 520 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jaret Burkett (founder of Ostris), and 1 more.

nunchaku by nunchaku-tech

High-performance 4-bit diffusion model inference engine
Top 2.1% · 3k stars
created 8 months ago · updated 14 hours ago