go-llama.cpp by go-skynet

Go bindings for llama.cpp

Created 2 years ago

858 stars

Top 41.8% on SourcePulse

Project Summary

This Go library provides high-level bindings for llama.cpp, enabling Go developers to integrate large language models into their applications. It targets Go developers seeking efficient LLM inference without the complexity of direct C++ interaction, leveraging llama.cpp's performance optimizations.

How It Works

The bindings are deliberately thin: most of the work stays in the native C/C++ code, and the Go API reaches it through a small C/C++ interface layer. Keeping the heavy lifting inside llama.cpp minimizes overhead on the Go side and keeps the Go API simple. The llama.cpp dependency is managed as a git submodule.

Quick Start & Requirements

Install: git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp
Build: make libbinding.a
Run example: LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD go run ./examples -m "/model/path/here" -t 14
Dependencies: C/C++ compiler, Go toolchain.
Acceleration: Supports OpenBLAS, cuBLAS (CUDA), ROCm (hipBLAS), OpenCL, and Metal (Apple Silicon) via build flags.
Docs: https://github.com/go-skynet/go-llama.cpp
Examples: https://github.com/go-skynet/go-llama.cpp/tree/master/examples
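If the run step needs to be scripted from Go rather than typed in a shell, the command above could be assembled roughly as follows. This is a sketch under assumptions: exampleCommand is a hypothetical helper, and repoDir is assumed to be the cloned repository root in which make libbinding.a has already been run. The helper only builds the command and does not start it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// exampleCommand builds (but does not start) the "go run ./examples" command
// from the quick start. repoDir stands in for $PWD in the shell version.
func exampleCommand(repoDir, modelPath string, threads int) *exec.Cmd {
	cmd := exec.Command("go", "run", "./examples",
		"-m", modelPath,
		"-t", strconv.Itoa(threads),
	)
	cmd.Dir = repoDir
	// Mirror LIBRARY_PATH=$PWD C_INCLUDE_PATH=$PWD from the shell command so
	// the C toolchain can find the static library and its headers.
	cmd.Env = append(os.Environ(),
		"LIBRARY_PATH="+repoDir,
		"C_INCLUDE_PATH="+repoDir,
	)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	return cmd
}

func main() {
	cmd := exampleCommand("/path/to/go-llama.cpp", "/model/path/here", 14)
	fmt.Println(cmd.Dir, cmd.Args)
	// Call cmd.Run() to actually launch the example program.
}
```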

Highlighted Details

Supports GGUF file format exclusively (post-PR #180).
Offers build configurations for various hardware acceleration backends (OpenBLAS, cuBLAS, ROCm, OpenCL, Metal).
GPU offloading is configurable via the -ngl flag.
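A command-line program that accepts the -m, -t, and -ngl flags mentioned in this summary could parse them with Go's standard flag package, as sketched below. The options struct, the parseOptions helper, and the default values are illustrative assumptions, not the example program's actual source.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the settings behind the three flags. Field names and
// defaults are assumptions made for this sketch.
type options struct {
	ModelPath string // -m: path to a GGUF model file
	Threads   int    // -t: number of CPU threads
	GPULayers int    // -ngl: number of layers to offload to the GPU
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.StringVar(&o.ModelPath, "m", "", "path to the GGUF model file")
	fs.IntVar(&o.Threads, "t", 4, "number of threads")
	fs.IntVar(&o.GPULayers, "ngl", 0, "layers to offload to the GPU (0 keeps all layers on the CPU)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if o.ModelPath == "" {
		return options{}, fmt.Errorf("missing required flag -m")
	}
	return o, nil
}

func main() {
	// Fixed arguments keep the sketch deterministic; a real program would
	// pass os.Args[1:] instead.
	o, err := parseOptions([]string{"-m", "/model/path/here", "-t", "14", "-ngl", "1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("model=%s threads=%d gpu-layers=%d\n", o.ModelPath, o.Threads, o.GPULayers)
}
```

Leaving -ngl at 0 in this sketch corresponds to CPU-only inference; a positive value only has an effect when the library was built with a GPU backend.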

Maintenance & Community

Actively maintained, with recent merges indicating ongoing development.
No explicit community links (Discord/Slack) are provided in the README.

Licensing & Compatibility

License: MIT.
Compatible with commercial and closed-source applications.

Limitations & Caveats

Only the GGUF model format is supported; loading legacy GGML models requires checking out a specific older tag of the repository.
Building with specific acceleration backends may require additional system-level libraries and configurations.

Health Check

Last Commit: 3 days ago
Responsiveness: 1 week
Star History: 18 stars in the last 30 days