*Deprecated* minimal example for loading and running Llama 3 models
This repository provides a minimal example for loading and running inference with Meta's Llama 3 large language models, available in 8B and 70B parameter sizes. It targets developers, researchers, and businesses seeking to integrate advanced LLM capabilities into their applications, offering a starting point for experimentation and scaling.
How It Works
The project leverages PyTorch and CUDA for efficient inference. It provides scripts for loading pre-trained and instruction-tuned models; the instruction-tuned variants expect chat dialogs in a specific special-token prompt format (sketched below). The architecture supports model parallelism (MP) for scaling inference across multiple GPUs: the 8B models use an MP value of 1 and the 70B models use 8, so the number of processes launched must match.
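As a minimal sketch of that chat format, the snippet below assembles a prompt using the published Llama 3 special tokens. The helper name `format_dialog` is illustrative only; in practice the repo's example scripts apply this template for you.

```python
# Illustrative sketch of the Llama 3 chat prompt template.
# Token strings follow the published Llama 3 format; the function
# name is a hypothetical helper, not part of this repo's API.
def format_dialog(messages: list[dict[str, str]]) -> str:
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
        prompt += f"{msg['content']}<|eot_id|>"
    # Cue the model to generate the assistant's reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

print(format_dialog([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]))
```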
Quick Start & Requirements
- Install with `pip install -e .` within a Conda environment with PyTorch/CUDA.
- Download model weights with `huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct --include "original/*"`.
- Required tools: `wget`, `md5sum`, PyTorch, CUDA.
- Run inference with `torchrun --nproc_per_node <MP_value> example_chat_completion.py --ckpt_dir <model_path> --tokenizer_path <tokenizer_path> ...` (a Python sketch follows this list).
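For orientation, here is a minimal sketch in the spirit of the repo's `example_chat_completion.py`, using the `Llama.build` / `chat_completion` API from this repository. The checkpoint paths are placeholders for wherever you downloaded the weights, and the script must be launched with `torchrun` as shown above.

```python
# Minimal sketch following this repo's example_chat_completion.py.
# Launch with: torchrun --nproc_per_node 1 chat_demo.py
# Paths below are placeholders for your downloaded 8B-Instruct checkpoint.
from llama import Llama

generator = Llama.build(
    ckpt_dir="Meta-Llama-3-8B-Instruct/",
    tokenizer_path="Meta-Llama-3-8B-Instruct/tokenizer.model",
    max_seq_len=512,
    max_batch_size=1,
)

# Each dialog is a list of {"role", "content"} messages.
dialogs = [[{"role": "user", "content": "What is the capital of France?"}]]
results = generator.chat_completion(
    dialogs,
    max_gen_len=64,
    temperature=0.6,
    top_p=0.9,
)
print(results[0]["generation"]["content"])
```

For the 70B models, the same script is launched with `--nproc_per_node 8` to match their MP value.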
Maintenance & Community
This repository is marked as deprecated, with functionality migrated to several new repositories: llama-models, PurpleLlama, llama-toolchain, llama-agentic-system, and llama-cookbook. Users are directed to these new repos for ongoing development and support.
Licensing & Compatibility
The models and weights are licensed for both researchers and commercial entities, subject to an accompanying Acceptable Use Policy.
Limitations & Caveats
This repository is deprecated and serves only as a minimal example. All active development and support have moved to newer, specialized repositories within the Llama Stack.