llama2-burn  by Gadersd

Llama2 port to Rust's Burn framework

created 2 years ago
280 stars


Project Summary

This project ports Meta's Llama2 large language model to Burn, a deep learning framework written in Rust. It converts the original model weights into a form Burn can load, so Rust developers can run inference and experiment with Llama2 without leaving the Rust ecosystem.

How It Works

The workflow has two stages. First, Python scripts load the original Llama2 weights and tokenizer and dump them to an intermediate format. Second, Rust binaries read the dumped weights, convert them into Burn's model format, and provide commands for testing inference and sampling text. Splitting the work this way bridges the PyTorch-based original and the Burn framework: Python handles loading and dumping, Rust handles conversion and inference.
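To make the dump-then-convert handoff concrete, here is a minimal sketch of reading one dumped tensor on the Rust side. This summary does not specify the project's actual dump format, so the byte layout below (a little-endian u32 rank, then u32 dimensions, then f32 values) is an assumption chosen for illustration, and `RawTensor`, `encode_tensor`, and `decode_tensor` are hypothetical names. The sketch uses only the standard library and does not call any Burn API.

```rust
use std::convert::TryInto;

/// A dense tensor held as a shape plus row-major f32 data.
/// Hypothetical type for illustration; not part of llama2-burn or Burn.
#[derive(Debug, PartialEq)]
pub struct RawTensor {
    pub shape: Vec<usize>,
    pub data: Vec<f32>,
}

/// Serialize in the assumed layout: u32 rank, u32 dims, then f32 values,
/// all little-endian. Stands in for the "dump" stage.
pub fn encode_tensor(t: &RawTensor) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(t.shape.len() as u32).to_le_bytes());
    for &d in &t.shape {
        out.extend_from_slice(&(d as u32).to_le_bytes());
    }
    for &v in &t.data {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Read one little-endian u32 at `offset`, failing if the input is too short.
fn read_u32(bytes: &[u8], offset: usize) -> Result<u32, String> {
    let chunk = bytes.get(offset..offset + 4).ok_or("truncated header")?;
    Ok(u32::from_le_bytes(chunk.try_into().unwrap()))
}

/// Parse the assumed layout back into a tensor, checking that the payload
/// length matches the declared shape. Stands in for the "convert" stage input.
pub fn decode_tensor(bytes: &[u8]) -> Result<RawTensor, String> {
    let rank = read_u32(bytes, 0)? as usize;
    let mut shape = Vec::new();
    for i in 0..rank {
        shape.push(read_u32(bytes, 4 + 4 * i)? as usize);
    }
    let payload = &bytes[4 + 4 * rank..];
    let expected: usize = shape.iter().product();
    if payload.len() != expected * 4 {
        return Err(format!(
            "shape implies {} values, but payload has {} bytes",
            expected,
            payload.len()
        ));
    }
    let data = payload
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes(c.try_into().unwrap()))
        .collect();
    Ok(RawTensor { shape, data })
}
```

Calling `decode_tensor(&encode_tensor(&t))` returns the original tensor, and a truncated buffer returns an error instead of silently producing a tensor of the wrong shape. That check is the main point of the sketch, since a mismatch between dumped and expected shapes is a typical failure when moving weights between frameworks.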

Quick Start & Requirements

  • Install/Run: Requires the Rust toolchain and Python 3. Python scripts are run with python3; Rust binaries are run with cargo run.
  • Prerequisites: Llama2 model files (downloaded from Meta or Hugging Face) and the tokenizer.model file.
  • GPU: Optional. GPU use requires the TORCH_CUDA_VERSION environment variable to be set (e.g., export TORCH_CUDA_VERSION=cu113).
  • Resources: Weights are loaded and converted in CPU RAM, so the machine needs enough system memory to hold the model.
  • Docs: Usage examples provided in README.

Highlighted Details

  • Port of Llama2 to the Rust Burn framework.
  • Python scripts for initial model loading, testing, and weight dumping.
  • Rust binaries for weight conversion, inference testing, and text sampling.
  • Supports CPU and GPU inference.

Maintenance & Community

  • Open to contributions via pull requests.
  • No specific community channels or notable contributors mentioned.

Licensing & Compatibility

  • Licensed under the terms in the repository's LICENSE file. MIT and Apache 2.0 are common choices for Rust projects, but that is a guess; check the file to confirm.
  • Commercial use depends on both the Llama2 license from Meta, which governs the model weights, and the project's own license.

Limitations & Caveats

Weight conversion and loading run on the CPU and require significant RAM. The project appears to be a direct port: no performance benchmarks are documented, and inference may not be optimized.
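As a rough guide to how much RAM is "significant", the weights alone need about (parameter count) × (bytes per parameter). The sketch below does that arithmetic. The parameter count of roughly 7 billion for the smallest Llama2 variant and the choice of 4 bytes (f32) or 2 bytes (f16) per weight are assumptions for illustration; this summary does not state which precision the project uses, and peak usage during conversion can be higher if more than one copy of the weights is held at once. The function names are hypothetical.

```rust
/// Bytes needed to hold `params` weights at `bytes_per_param` each.
/// Covers the weights only, not activations or temporary copies.
pub fn weight_bytes(params: u64, bytes_per_param: u64) -> u64 {
    params * bytes_per_param
}

/// Convert a byte count to GiB (1 GiB = 1024^3 bytes).
pub fn bytes_to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0 * 1024.0)
}
```

Under these assumptions, `bytes_to_gib(weight_bytes(7_000_000_000, 4))` comes to about 26 GiB for f32 weights, and half that for f16.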

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days
