nano-llama31 by karpathy

Minimal Llama 3.1 implementation for training, finetuning, and inference

Created 1 year ago
1,423 stars

Top 28.6% on SourcePulse

Project Summary

This repository provides a minimal implementation of the Llama 3.1 architecture with no dependencies beyond PyTorch, inspired by the nanoGPT project. It aims to simplify training, fine-tuning, and inference for the Llama 3.1 8B base model, offering a cleaner alternative to the official Meta and Hugging Face releases. The project targets users who want a more streamlined and understandable codebase for working with Llama 3.1.

How It Works

The project replicates the Llama 3.1 architecture in a single PyTorch file (llama31.py), focusing on clarity and minimal dependencies. The code adapts and simplifies Meta's official release, with functional parity verified by testing against the reference implementation. This makes the model's components easier to understand and modify.
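To give a flavor of that single-file style, below is a minimal sketch of RMSNorm, the normalization layer Llama-family models use in place of LayerNorm. It is illustrative only, not code from llama31.py; the module layout and eps value are assumptions.

    # Illustrative sketch (not the repo's code): RMSNorm, the
    # normalization used throughout Llama-family models.
    import torch
    import torch.nn as nn

    class RMSNorm(nn.Module):
        def __init__(self, dim: int, eps: float = 1e-5):
            super().__init__()
            self.eps = eps
            self.weight = nn.Parameter(torch.ones(dim))  # learnable gain

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # scale each token by the reciprocal root-mean-square of its features
            rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
            return self.weight * (x * rms)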

Quick Start & Requirements

  • Install: Create a conda environment (conda create -n llama31 python=3.10, then conda activate llama31), clone Meta's official llama-models repo, download the Llama 3.1 8B base model, install llama-models (pip install -r requirements.txt, pip install -e .), and run inference with torchrun --nnodes 1 --nproc_per_node 1 reference.py --ckpt_dir <path_to_model> --tokenizer_path <path_to_model>.
  • Prerequisites: Python 3.10 (newer versions may hit PyTorch compatibility issues), PyTorch, and access to the Llama 3.1 weights from Meta (granted on request).
  • Resources: Downloading the 8B model takes ~16GB of disk space. Fine-tuning needs significant VRAM (e.g., an 80GB GPU for RMSNorm-only training; a freezing sketch follows this list).
  • Links: Official Llama 3.1 Model Access
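A plausible way to set up that RMSNorm-only fine-tuning is to freeze every parameter except the norm gains. The sketch below is hypothetical and runs against a generic PyTorch module, not the repo's training script; the name-matching heuristic is an assumption.

    # Hypothetical sketch: train only the RMSNorm gains, freeze the rest.
    # Assumes norm parameters have "norm" in their names, as in most
    # Llama ports; not taken from the repo's training code.
    import torch

    def norm_only_params(model: torch.nn.Module):
        trainable = []
        for name, param in model.named_parameters():
            param.requires_grad = "norm" in name
            if param.requires_grad:
                trainable.append(param)
        return trainable

    # Optimize only the small trainable subset:
    # opt = torch.optim.AdamW(norm_only_params(model), lr=1e-4)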

Highlighted Details

  • Minimal PyTorch implementation of Llama 3.1 with no dependencies beyond PyTorch itself.
  • Verified functional parity with the official Meta inference code (a parity-check sketch follows this list).
  • Includes a fix for the trailing-whitespace bug in Meta's example_text_completion.py.
  • Early-stage fine-tuning capabilities demonstrated on the TinyStories dataset.
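Functional parity of this kind is usually checked by feeding identical tokens through both models and comparing logits. The sketch below is hypothetical; the calling convention and tolerance are assumptions, not the repo's actual test harness.

    # Hypothetical parity check between the reference model and the
    # minimal reimplementation; the calling convention is assumed.
    import torch

    @torch.no_grad()
    def logits_match(ref_model, nano_model, tokens: torch.Tensor,
                     atol: float = 1e-4) -> bool:
        ref = ref_model(tokens)     # (batch, seq_len, vocab_size)
        nano = nano_model(tokens)
        return torch.allclose(ref, nano, atol=atol)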

Maintenance & Community

Marked as "WIP" and "not ready for prime time." The README indicates ongoing work to add features, improve fine-tuning, and support chat models, though the repository has seen no commits in the past year (see Health Check below).

Licensing & Compatibility

The README does not explicitly state the license for this repository. It relies on Meta's official Llama 3.1 models, which have their own usage terms. Compatibility with commercial or closed-source projects would depend on the underlying Llama 3.1 license.

Limitations & Caveats

The project is explicitly marked as "WIP" and "not ready for prime time." Fine-tuning is still considered broken, with noted issues around attention masking for BOS tokens and the use of the KV cache during training. Support for models larger than 8B and for chat models is pending. The code also emits a warning about the deprecated torch.set_default_tensor_type.
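For context on the KV-cache caveat: a KV cache stores past keys and values so inference can score one new token at a time, whereas training sees the whole sequence at once under a causal mask. The schematic below (not the repo's code) shows that the two attention paths compute the same thing:

    # Schematic contrast (not the repo's code): training-style causal
    # attention over the full sequence vs. inference-style decoding
    # against a growing KV cache. Both paths produce identical outputs.
    import torch
    import torch.nn.functional as F

    B, T, D = 1, 8, 16
    q, k, v = (torch.randn(B, T, D) for _ in range(3))

    # Training path: all positions at once, future positions masked out.
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
    att = (q @ k.transpose(-2, -1)) / D**0.5
    out_train = F.softmax(att.masked_fill(~mask, float("-inf")), dim=-1) @ v

    # Inference path: token t attends to the cached keys/values 0..t.
    outs = []
    for t in range(T):
        k_c, v_c = k[:, : t + 1], v[:, : t + 1]   # the "KV cache"
        a = (q[:, t : t + 1] @ k_c.transpose(-2, -1)) / D**0.5
        outs.append(F.softmax(a, dim=-1) @ v_c)
    out_infer = torch.cat(outs, dim=1)

    assert torch.allclose(out_train, out_infer, atol=1e-5)

In many implementations the cache is written in place into preallocated buffers, which does not interact well with autograd; this is one reason KV caches are normally disabled during training.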

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), and 20 more.

TinyLlama by jzhang38

Top 0.1%
9k stars
Tiny pretraining project for a 1.1B Llama model
Created 2 years ago
Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), John Yang (coauthor of SWE-bench, SWE-agent), and 28 more.

stanford_alpaca by tatsu-lab

Top 0.1%
30k stars
Instruction-following LLaMA model training and data generation
Created 2 years ago
Updated 1 year ago
Starred by Roy Frostig (coauthor of JAX; Research Scientist at Google DeepMind), Zhiqiang Xie (coauthor of SGLang), and 40 more.

llama by meta-llama

Top 0.1%
59k stars
Inference code for Llama 2 models (deprecated)
Created 2 years ago
Updated 7 months ago