rust-llama.cpp  by mdrokz

Rust bindings for llama.cpp

created 2 years ago
395 stars

Top 74.1% on sourcepulse

Project Summary

This project provides Rust bindings for the popular llama.cpp library, enabling developers to integrate large language model inference directly into Rust applications. It targets Rust developers who want efficient, native LLM inference without shelling out to external processes or writing C glue code by hand.

How It Works

The bindings wrap the core llama.cpp C++ library using Rust's FFI (Foreign Function Interface). This approach leverages llama.cpp's optimized C++ implementation for performance while exposing an idiomatic Rust interface. The project uses a git submodule to manage the llama.cpp dependency, ensuring version consistency.
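The general shape of such bindings can be sketched as a safe Rust wrapper over a raw C-style call. This is a minimal, self-contained illustration of the pattern, not the crate's actual API: the function and names below are hypothetical, and a plain Rust stub stands in for what would really be an `extern "C"` declaration linked against the compiled llama.cpp library.

```rust
use std::ffi::c_int;

// Stand-in for an `extern "C"` function from llama.cpp. In real bindings
// this would be declared in an `extern "C" { ... }` block and linked
// against the C++ library built from the git submodule.
unsafe fn llama_tokenize_stub(text: *const u8, len: c_int) -> c_int {
    let _ = text;
    // Pretend: roughly one token per 4 bytes of input.
    (len + 3) / 4
}

/// Safe, idiomatic wrapper: callers work with `&str` and `usize`,
/// never with raw pointers.
fn token_count(text: &str) -> usize {
    unsafe { llama_tokenize_stub(text.as_ptr(), text.len() as c_int) as usize }
}

fn main() {
    // 11 bytes of input -> 3 "tokens" under the stub's rule.
    println!("{}", token_count("hello world"));
}
```

The wrapper confines all `unsafe` code to one small function, which is the usual design for FFI crates: the raw surface stays private while the public API is ordinary safe Rust.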

Quick Start & Requirements

  • Install via cargo add llama_cpp_rs.
  • Requires a pre-downloaded .ggmlv3 or .gguf model file.
  • Building locally requires git clone --recurse-submodules.
  • See examples for usage.
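Condensed, the setup steps above look like this (the repository URL is inferred from the project name; verify it before cloning):

```shell
# Add the bindings to an existing Rust project
cargo add llama_cpp_rs

# Or build from source; llama.cpp is vendored as a git submodule,
# so the clone must pull submodules too
git clone --recurse-submodules https://github.com/mdrokz/rust-llama.cpp
cd rust-llama.cpp
cargo build --release
```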

Highlighted Details

  • Supports the GGUF model format.
  • Includes experimental GPU support via Metal.
  • CUDA, OpenBLAS, and OpenCL support are in progress.
  • Provides a token callback mechanism for streaming output.
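The token-callback mechanism can be illustrated with a small self-contained sketch. The shape (a callback invoked per token, returning a boolean to continue or stop) mirrors how such streaming APIs typically work; `predict_stub` below is a hypothetical stand-in generator, not the crate's real inference loop.

```rust
// Stand-in "prediction" loop: yields pre-baked tokens instead of
// running real inference.
fn predict_stub<F: FnMut(&str) -> bool>(tokens: &[&str], mut on_token: F) -> String {
    let mut output = String::new();
    for tok in tokens {
        // The callback sees each token as it is produced; returning
        // `false` stops generation early.
        if !on_token(tok) {
            break;
        }
        output.push_str(tok);
    }
    output
}

fn main() {
    let text = predict_stub(&["Hello", ",", " world"], |tok| {
        print!("{tok}"); // stream to stdout as tokens arrive
        true
    });
    println!();
    assert_eq!(text, "Hello, world");
}
```

This is what makes streaming output possible: the caller observes tokens incrementally rather than waiting for the full completion.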

Maintenance & Community

The project is actively maintained by mdrokz. There are no explicit community channels or roadmap links provided in the README.

Licensing & Compatibility

Licensed under the MIT license, permitting commercial use and integration into closed-source projects.

Limitations & Caveats

GPU acceleration beyond Metal is still under development. The project lacks comprehensive test coverage, and planned features such as fetching models over HTTP or from S3 are not yet implemented.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 10 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Shawn Wang (editor of Latent Space), and 8 more.

llm by rustformers

  • 6k stars
  • Rust ecosystem for LLM inference (unmaintained)
  • Created 2 years ago, updated 1 year ago
  • Starred by Andrej Karpathy (founder of Eureka Labs; formerly at Tesla and OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

  • 84k stars
  • C/C++ library for local LLM inference
  • Created 2 years ago, updated 16 hours ago