llama3.np by likejazz

NumPy implementation for Llama 3 model

Created 1 year ago

991 stars

Top 37.5% on SourcePulse

Project Summary

llama3.np is a pure NumPy implementation of the Llama 3 model, aimed at researchers and developers who want to understand LLM internals without heavy dependencies. It presents Llama 3's architecture and inference in a readable, educational form and runs entirely on CPU.

How It Works

The project implements Llama 3 using only NumPy, the core Python library for numerical array operations. With no deep-learning framework in between, the model's architecture and inference process can be read and traced step by step as plain array code. The implementation is validated against Andrej Karpathy's stories15M model.
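To give a flavor of what "only NumPy" means in practice, here is a minimal sketch of two building blocks used in Llama-family models: RMS normalization and a SwiGLU feed-forward layer. This is not taken from the llama3.np source; the function names, the `eps` default, and the toy sizes are assumptions chosen for illustration, and the weights are random rather than loaded from a checkpoint.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root-mean-square over the last axis, then scale."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return weight * (x / rms)

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    """Gated feed-forward block: (silu(x @ w_gate) * (x @ w_up)) @ w_down."""
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

# Toy sizes and random weights (hypothetical, for illustration only).
rng = np.random.default_rng(0)
dim, hidden = 8, 16
x = rng.standard_normal((4, dim))  # 4 token positions

h = rms_norm(x, np.ones(dim))
out = swiglu_ffn(
    h,
    rng.standard_normal((dim, hidden)),
    rng.standard_normal((dim, hidden)),
    rng.standard_normal((hidden, dim)),
)
print(out.shape)  # (4, 8)
```

Each function is a few lines of array arithmetic, which is what makes a NumPy-only implementation easy to step through in a debugger or notebook.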

Quick Start & Requirements

Primary install / run command: python llama3.py "Your prompt"
Prerequisites: Python, NumPy. No GPU or CUDA required.
Links: Official GitHub Repo

Highlighted Details

Pure NumPy implementation for Llama 3.
Validated against the stories15M model.
Educational focus on LLM architecture and inference.
CPU-based execution.

Maintenance & Community

The project is authored by Sang Park. It references and is inspired by llama2.c, llama.np, and Hugging Face's Transformers.

Licensing & Compatibility

License: MIT License.
Compatibility: The MIT License permits commercial use and integration with closed-source projects.

Limitations & Caveats

As a pure NumPy implementation, performance will be significantly lower than optimized GPU or C/CUDA versions. It is intended for educational purposes and understanding, not for production-scale inference.
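One rough way to see where a CPU-only NumPy setup stands on a given machine is to time a repeated step and report throughput. The sketch below is an assumption-laden illustration: `tokens_per_second` is a hypothetical helper, and the single matrix-vector product is only a stand-in for a real decoding step, not a measurement of llama3.np itself.

```python
import time
import numpy as np

def tokens_per_second(step_fn, n_tokens):
    """Call step_fn n_tokens times and return calls per second."""
    start = time.perf_counter()
    for _ in range(n_tokens):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# Hypothetical stand-in for one decoding step: a single matrix-vector product.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)  # arbitrary toy size
v = rng.standard_normal(256).astype(np.float32)

rate = tokens_per_second(lambda: w @ v, n_tokens=100)
print(f"{rate:.1f} steps/s")
```

Swapping the lambda for a call into an actual generation loop would give a machine-specific figure to compare against GPU or C implementations.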

Health Check

Last Commit: 6 months ago
Responsiveness: Inactive
Star History: 2 stars in the last 30 days