llm by rustformers

Rust ecosystem for LLM inference (unmaintained)

Created 2 years ago
6,129 stars

Top 8.5% on SourcePulse

Project Summary

This project provides an ecosystem of Rust libraries for working with large language models (LLMs), built on the GGML tensor library. It targets developers and end-users seeking efficient, Rust-native LLM inference, offering a CLI for direct interaction and a crate for programmatic use.

How It Works

The core of the project leverages the GGML tensor library, aiming to bring Rust's robustness and ease of use to LLM inference. It supports various model architectures and quantization methods, with an initial focus on CPU inference, though GPU acceleration (CUDA, Metal) was a planned feature.
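
To make "quantization" concrete, the sketch below shows block-wise 4-bit quantization in plain Rust: each block of weights shares one f32 scale, and each weight is stored as a small signed integer. It is a simplified illustration only. The block size of 32, the names `QBlock`, `quantize_block`, and `dequantize_block`, and the one-value-per-byte storage are assumptions chosen for readability; they do not reproduce GGML's actual formats or the llm crate's API.

```rust
/// Number of weights that share one scale factor (an assumed value for this sketch).
const BLOCK: usize = 32;

/// One quantized block: a shared scale plus one 4-bit value per weight.
/// Values are kept one per byte for clarity; real formats typically pack two per byte.
struct QBlock {
    scale: f32,
    q: [i8; BLOCK],
}

/// Quantize a block of f32 weights to integers in the range -7..=7.
fn quantize_block(w: &[f32; BLOCK]) -> QBlock {
    // The largest magnitude in the block determines the scale.
    let max_abs = w.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 7.0 };
    let mut q = [0i8; BLOCK];
    for (qi, x) in q.iter_mut().zip(w.iter()) {
        *qi = (x / scale).round().clamp(-7.0, 7.0) as i8;
    }
    QBlock { scale, q }
}

/// Recover approximate f32 weights from a quantized block.
fn dequantize_block(b: &QBlock) -> [f32; BLOCK] {
    let mut out = [0.0f32; BLOCK];
    for (o, qi) in out.iter_mut().zip(b.q.iter()) {
        *o = f32::from(*qi) * b.scale;
    }
    out
}

fn main() {
    // Sample weights spread over [-1.0, 1.0).
    let mut w = [0.0f32; BLOCK];
    for (i, x) in w.iter_mut().enumerate() {
        *x = (i as f32 - 16.0) / 16.0;
    }
    let restored = dequantize_block(&quantize_block(&w));
    let max_err = w
        .iter()
        .zip(restored.iter())
        .fold(0.0f32, |m, (a, b)| m.max((a - b).abs()));
    println!("max round-trip error: {max_err}");
}
```

The round-trip error per weight is bounded by roughly half the block's scale, which is the trade-off quantization makes: smaller model files and less memory traffic in exchange for a small loss of precision.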

Quick Start & Requirements

  • Install CLI from source: cargo install --git https://github.com/rustformers/llm llm-cli
  • Project requires Rust v1.65.0+ and a modern C toolchain.
  • GPU support (CUDA, OpenCL, Metal) requires specific build configurations; see the project documentation.
  • Links: Docs.rs, GitHub Releases
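
Before installing, it can help to confirm that the local toolchain meets the Rust v1.65.0+ requirement. The following is a minimal sketch using only the standard library: it runs `rustc --version` and compares the result against 1.65.0. The helper name `parse_rustc_version` and the constant `MIN_RUST` are hypothetical, and the code assumes `rustc` is on the PATH and prints its version in the usual `rustc X.Y.Z (...)` form.

```rust
use std::process::Command;

/// Minimum Rust version named in the requirements above.
const MIN_RUST: (u32, u32, u32) = (1, 65, 0);

/// Parse the version triple out of output such as
/// "rustc 1.70.0 (90c541806 2023-05-31)".
fn parse_rustc_version(output: &str) -> Option<(u32, u32, u32)> {
    let version = output.split_whitespace().nth(1)?;
    // Drop any pre-release suffix such as "-nightly" or "-beta.1".
    let core = version.split('-').next()?;
    let mut parts = core.split('.').map(|p| p.parse::<u32>().ok());
    Some((parts.next()??, parts.next()??, parts.next()??))
}

fn main() {
    // Assumes `rustc` is on PATH.
    let out = Command::new("rustc").arg("--version").output();
    let version = out
        .ok()
        .and_then(|o| parse_rustc_version(&String::from_utf8_lossy(&o.stdout)));
    match version {
        // Tuples compare lexicographically: major, then minor, then patch.
        Some(v) if v >= MIN_RUST => println!("rustc {}.{}.{} is new enough", v.0, v.1, v.2),
        Some(v) => println!("rustc {}.{}.{} is older than 1.65.0", v.0, v.1, v.2),
        None => println!("could not determine the rustc version"),
    }
}
```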

Highlighted Details

  • Supports BLOOM, GPT-2, GPT-J, GPT-NeoX (StableLM, RedPajama, Dolly 2.0), LLaMA (Alpaca, Vicuna, Koala, GPT4All, Wizard), and MPT models.
  • CLI offers infer, REPL, chat modes, model serialization, quantization, and perplexity computation.
  • Supports remote fetching of tokenizers from Hugging Face.
  • Bindings available for Python and Node.js.
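
Perplexity, one of the CLI features listed above, is conventionally defined as the exponential of the average negative log-likelihood a model assigns to the tokens of a text. The sketch below computes it from per-token probabilities in plain Rust. The function name `perplexity` and its input format are assumptions for illustration and are not taken from the llm crate.

```rust
/// Perplexity of a token sequence, given the probability the model
/// assigned to each actual token. Returns None for empty input or
/// for probabilities outside (0, 1].
fn perplexity(token_probs: &[f64]) -> Option<f64> {
    if token_probs.is_empty() || token_probs.iter().any(|&p| !(p > 0.0 && p <= 1.0)) {
        return None;
    }
    // Average negative log-likelihood per token.
    let nll = token_probs.iter().map(|p| -p.ln()).sum::<f64>() / token_probs.len() as f64;
    Some(nll.exp())
}

fn main() {
    // A model that gives every token probability 0.25 is as uncertain
    // as a uniform choice among four options.
    let probs = [0.25, 0.25, 0.25, 0.25];
    println!("{:?}", perplexity(&probs));
}
```

Lower values mean the model found the text less surprising, which is why perplexity is a common way to compare a quantized model against its unquantized original.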

Maintenance & Community

  • ARCHIVED: The project is unmaintained due to lack of time and resources.
  • Recommendations are provided for active alternatives like Ratchet, Candle-based libraries (mistral.rs, kalosm, candle-transformers), and llama.cpp wrappers.
  • Community contact: Discord.

Licensing & Compatibility

  • The README does not explicitly state a license. MIT or Apache 2.0 would be typical for a Rust project of this kind, but this is unconfirmed and should be verified against the repository.
  • Compatibility for commercial use is not specified.

Limitations & Caveats

The project is archived and no longer actively maintained. The released version (0.1.1) is significantly out of date. The main and gguf branches are also outdated and do not support GGUF or the latest GGML versions. The develop branch, intended to sync with the latest GGML and support GGUF, was not completed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 11 stars in the last 30 days

Explore Similar Projects

  • starcoder.cpp by bigcode-project (0%, 456 stars): C++ example for StarCoder inference. Created 2 years ago; updated 2 years ago. Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Eugene Yan (AI Scientist at AWS), and 2 more.
  • gemma_pytorch by google (0.2%, 6k stars): PyTorch implementation for Google's Gemma models. Created 1 year ago; updated 3 months ago. Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jeff Hammerbacher (Cofounder of Cloudera), and 4 more.
  • gpt-fast by meta-pytorch (0.2%, 6k stars): PyTorch text generation for efficient transformer inference. Created 1 year ago; updated 3 weeks ago. Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Lei Zhang (Director Engineering AI at AMD), and 23 more.
  • ggml by ggml-org (0.3%, 13k stars): Tensor library for machine learning. Created 3 years ago; updated 2 days ago. Starred by Bojan Tunguz (AI Scientist; Formerly at NVIDIA), Alex Chen (Cofounder of Nexa AI), and 19 more.