llama3.np  by likejazz

NumPy implementation for Llama 3 model

created 1 year ago
987 stars

Top 38.3% on sourcepulse

Project Summary

llama3.np is a pure NumPy implementation of the Llama 3 model for researchers and developers who want to understand LLM internals without heavy dependencies. It presents Llama 3's architecture and inference path in readable code that runs on CPU, which makes it suited to experimentation and learning.

How It Works

The project implements Llama 3 inference using only NumPy, so each step of the model appears as plain array operations rather than being hidden behind a deep learning framework. The implementation is validated using Andrej Karpathy's stories15M model.
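
To give a sense of what "plain array operations" can look like, here is a minimal sketch of two building blocks that Llama-style models use: RMS normalization and causal scaled dot-product attention. This is not taken from the llama3.np source; the function names (rms_norm, softmax, causal_attention), the single-head shapes, and the eps value are assumptions chosen for illustration.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # Divide each vector by its root-mean-square, then apply a learned gain.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def causal_attention(q, k, v):
    # q, k, v: arrays of shape (seq_len, head_dim) for a single head (illustrative).
    seq_len, head_dim = q.shape
    scores = q @ k.T / np.sqrt(head_dim)
    # Mask out future positions so token i only attends to tokens 0..i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # 4 tokens, 8 features (toy sizes)
h = rms_norm(x, np.ones(8))
out = causal_attention(h, h, h)
print(out.shape)  # (4, 8)
```

Because of the causal mask, the first output row equals the first value row: the first token can attend only to itself.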

Quick Start & Requirements

  • Run command: python llama3.py "Your prompt"
  • Prerequisites: Python, NumPy. No GPU or CUDA required.
  • Links: Official GitHub Repo

Highlighted Details

  • Pure NumPy implementation for Llama 3.
  • Validated using the stories15M model.
  • Educational focus on LLM architecture and inference.
  • CPU-based execution.
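
The inference side of such a project usually comes down to a token-by-token generation loop. The sketch below shows greedy decoding on CPU with NumPy. It is a hypothetical illustration, not the project's code: toy_logits stands in for a real model forward pass, and generate_greedy, vocab_size, and eos_id are names assumed here.

```python
import numpy as np

def toy_logits(tokens, vocab_size):
    # Hypothetical stand-in for a model forward pass:
    # it always favors (last token + 1) modulo the vocabulary size.
    logits = np.zeros(vocab_size)
    logits[(tokens[-1] + 1) % vocab_size] = 1.0
    return logits

def generate_greedy(prompt_tokens, max_new_tokens, vocab_size, eos_id=None):
    # Repeatedly pick the highest-scoring next token and append it.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = toy_logits(tokens, vocab_size)
        next_id = int(np.argmax(logits))
        tokens.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break  # stop early once the end-of-sequence token appears
    return tokens

print(generate_greedy([3], max_new_tokens=4, vocab_size=6))  # [3, 4, 5, 0, 1]
```

A real model would replace toy_logits with a forward pass over the whole sequence (or a cached version of it) and would typically map token IDs back to text with a tokenizer.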

Maintenance & Community

The project is authored by Sang Park. It references and is inspired by llama2.c, llama.np, and Hugging Face's Transformers.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT License permits commercial use and integration into closed-source projects.

Limitations & Caveats

As a pure NumPy implementation, it runs significantly slower than optimized C, CUDA, or other GPU-accelerated implementations. It is intended for education and understanding, not for production-scale inference.
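
One way to quantify that gap on a given machine is to time the generation step. The following is a rough sketch of a throughput measurement, assuming a callable that produces one token per call. The tokens_per_second helper and toy_step are hypothetical; toy_step is a single small matrix-vector product, not a real model step, so the number it yields says nothing about llama3.np itself.

```python
import time
import numpy as np

def tokens_per_second(step_fn, n_tokens):
    # Time n_tokens calls to step_fn and return calls per second.
    start = time.perf_counter()
    for _ in range(n_tokens):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_tokens / max(elapsed, 1e-9)  # guard against a zero interval

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
x = rng.standard_normal(256).astype(np.float32)

def toy_step():
    # Hypothetical stand-in for generating one token.
    return w @ x

rate = tokens_per_second(toy_step, n_tokens=50)
print(f"{rate:.1f} toy steps per second")
```

To measure an actual model, pass a function that runs one real decoding step in place of toy_step.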

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 10 stars in the last 90 days
