llama3 by meta-llama

*Deprecated* minimal example for loading and running Llama 3 models

created 1 year ago
28,867 stars

Top 1.3% on sourcepulse

Project Summary

This repository provides a minimal example for loading and running inference with Meta's Llama 3 large language models, available in 8B and 70B parameter sizes. It targets developers, researchers, and businesses seeking to integrate advanced LLM capabilities into their applications, offering a starting point for experimentation and scaling.

How It Works

The project relies on PyTorch and CUDA for inference. It provides example scripts for loading both pre-trained and instruction-tuned checkpoints, with specific prompt-formatting requirements for chat-based interactions. The architecture supports model parallelism (MP) for scaling inference across multiple GPUs: the 8B models use an MP value of 1 and the 70B models an MP value of 8.
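The end-to-end flow looks roughly like the minimal sketch below, modeled on the repository's example scripts (e.g., example_chat_completion.py); the paths, generation parameters, and exact helper names are assumptions and may differ between versions. The script must be launched with torchrun so the number of processes matches the model's MP value.

```python
# Minimal sketch of chat-completion inference, modeled on the repository's
# example scripts. Paths and parameters are placeholders; run under torchrun,
# e.g.: torchrun --nproc_per_node 1 chat_demo.py   (MP = 1 for the 8B models)
from llama import Llama  # installed by `pip install -e .` in this repo

generator = Llama.build(
    ckpt_dir="Meta-Llama-3-8B-Instruct/",                       # downloaded checkpoint directory
    tokenizer_path="Meta-Llama-3-8B-Instruct/tokenizer.model",  # tokenizer shipped with the weights
    max_seq_len=512,      # the models support sequences up to 8192 tokens
    max_batch_size=4,
)

dialogs = [
    [{"role": "user", "content": "Explain model parallelism in one sentence."}],
]
results = generator.chat_completion(
    dialogs,
    max_gen_len=128,
    temperature=0.6,
    top_p=0.9,
)
print(results[0]["generation"]["content"])
```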

Quick Start & Requirements

  • Install: Clone the repo and run pip install -e . within a Conda environment with PyTorch/CUDA.
  • Model Weights: Download via a signed URL from the Meta Llama website or from Hugging Face (e.g., huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct --include "original/*"); a programmatic alternative is sketched after this list.
  • Prerequisites: wget, md5sum, PyTorch, CUDA.
  • Run Inference: torchrun --nproc_per_node <MP_value> example_chat_completion.py --ckpt_dir <model_path> --tokenizer_path <tokenizer_path> ...
  • Resources: Requires significant GPU memory, especially for the 70B model.
  • Docs: Meta Llama website, Llama Cookbook.
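
For the Hugging Face route, the same files can also be fetched programmatically. The snippet below is a hedged sketch that assumes the huggingface_hub package is installed and that the account has been granted access to the gated meta-llama repository (e.g., after huggingface-cli login).

```python
# Hedged sketch: programmatic alternative to the huggingface-cli command above.
# Assumes `pip install huggingface_hub` and access to the gated repository.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    allow_patterns=["original/*"],  # only the "original" (PyTorch) checkpoint files used by this repo
)
print("Checkpoint downloaded to:", local_dir)
```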

Highlighted Details

  • Supports sequence lengths up to 8192 tokens.
  • Offers both pre-trained and instruction-tuned models.
  • Includes guidance on responsible use and safety classifiers.
  • Provides specific chat formatting for instruction-tuned models (sketched below).
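
For reference, the prompt layout used by the instruction-tuned models is believed to follow the format documented in the Meta Llama 3 model card, sketched below; the chat_completion helper in the example scripts applies this formatting automatically, so building prompts by hand is only needed when driving text completion directly.

```python
# Hedged sketch of the Llama 3 instruct chat template (single system + user turn),
# per the publicly documented format; verify against the official model card.
def format_llama3_chat(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"  # generation continues from here
    )

print(format_llama3_chat("You are a helpful assistant.", "What is model parallelism?"))
```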

Maintenance & Community

This repository is marked as deprecated, with functionality migrated to several new repositories: llama-models, PurpleLlama, llama-toolchain, llama-agentic-system, and llama-cookbook. Users are directed to these new repos for ongoing development and support.

Licensing & Compatibility

The models and weights are licensed for both researchers and commercial entities under the Meta Llama 3 Community License, which is accompanied by an Acceptable Use Policy.

Limitations & Caveats

This repository is deprecated and serves only as a minimal example. All active development and support have moved to newer, specialized repositories within the Llama Stack.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

403 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Travis Fischer (Founder of Agentic), and 6 more.

codellama by meta-llama

Top 0.1% · 16k stars
Inference code for CodeLlama models
created 1 year ago · updated 11 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 9 more.

alpaca-lora by tloen

Top 0.0% · 19k stars
LoRA fine-tuning for LLaMA
created 2 years ago · updated 1 year ago