llama3 by meta-llama

*Deprecated* minimal example for loading and running Llama 3 models

Created 1 year ago
28,979 stars

Top 1.3% on SourcePulse

Project Summary

This repository provides a minimal example for loading and running inference with Meta's Llama 3 large language models, available in 8B and 70B parameter sizes. It targets developers, researchers, and businesses seeking to integrate advanced LLM capabilities into their applications, offering a starting point for experimentation and scaling.

How It Works

The project uses PyTorch and CUDA for inference. It provides scripts for loading both pre-trained and instruction-tuned checkpoints; chat-based interaction with the instruction-tuned models requires a specific prompt format, sketched below. Inference scales across multiple GPUs via model parallelism (MP), with an MP degree of 1 for the 8B models and 8 for the 70B models.
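
As a concrete illustration of that chat format, here is a minimal sketch of how an instruction-tuned Llama 3 prompt is assembled from the special tokens documented on the Meta Llama website. The assemble_prompt helper is hypothetical; the repo's tokenizer applies this formatting internally.

```python
# Minimal sketch of the Llama 3 instruct prompt layout, using the special
# tokens documented by Meta. assemble_prompt is a hypothetical helper; the
# repo applies this formatting for you when you call its chat example.
def assemble_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(assemble_prompt("You are a helpful assistant.", "Explain model parallelism."))
```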

Quick Start & Requirements

  • Install: Clone the repo and run pip install -e . within a Conda environment with PyTorch/CUDA.
  • Model Weights: Download via a signed URL from the Meta Llama website or from Hugging Face (e.g., huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct --include "original/*").
  • Prerequisites: wget, md5sum, PyTorch, CUDA.
  • Run Inference: torchrun --nproc_per_node <MP_value> example_chat_completion.py --ckpt_dir <model_path> --tokenizer_path <tokenizer_path> ... (a condensed Python sketch of this example follows the list).
  • Resources: Requires significant GPU memory, especially for the 70B model.
  • Docs: Meta Llama website, Llama Cookbook.
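
For orientation, here is a condensed sketch of what example_chat_completion.py does, assuming the Llama.build / chat_completion API used in the repo's example scripts; paths and sampling parameters below are placeholders.

```python
# Condensed sketch of example_chat_completion.py, assuming the repo's
# Llama.build / chat_completion API. Launch under torchrun, e.g.:
#   torchrun --nproc_per_node 1 sketch.py   # MP=1 for the 8B models
from llama import Llama

generator = Llama.build(
    ckpt_dir="Meta-Llama-3-8B-Instruct/",                      # placeholder path
    tokenizer_path="Meta-Llama-3-8B-Instruct/tokenizer.model", # placeholder path
    max_seq_len=512,
    max_batch_size=4,
)

dialogs = [[{"role": "user", "content": "What is the capital of France?"}]]
results = generator.chat_completion(
    dialogs, max_gen_len=64, temperature=0.6, top_p=0.9,
)

for result in results:
    print(result["generation"]["content"])
```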

Highlighted Details

  • Supports sequence lengths up to 8192 tokens.
  • Offers both pre-trained and instruction-tuned models (a text-completion sketch for the base models follows this list).
  • Includes guidance on responsible use and safety classifiers.
  • Provides specific chat formatting for instruction-tuned models.
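
Per the second bullet above, the pre-trained (base) checkpoints are driven through plain text completion rather than the chat format; a minimal sketch under the same API assumptions as the quick-start sketch:

```python
# Sketch of text completion with a pre-trained (non-instruct) checkpoint,
# assuming the same Llama API as above; launch under torchrun as before.
from llama import Llama

generator = Llama.build(
    ckpt_dir="Meta-Llama-3-8B/",                      # placeholder path
    tokenizer_path="Meta-Llama-3-8B/tokenizer.model", # placeholder path
    max_seq_len=128,
    max_batch_size=4,
)

results = generator.text_completion(
    ["The theory of relativity states that"],
    max_gen_len=64, temperature=0.6, top_p=0.9,
)
print(results[0]["generation"])
```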

Maintenance & Community

This repository is marked as deprecated, with functionality migrated to several new repositories: llama-models, PurpleLlama, llama-toolchain, llama-agentic-system, and llama-cookbook. Users are directed to these new repos for ongoing development and support.

Licensing & Compatibility

The models and weights are licensed for both research and commercial use under the Meta Llama 3 Community License, with an accompanying Acceptable Use Policy.

Limitations & Caveats

This repository is deprecated and serves only as a minimal example. All active development and support have moved to newer, specialized repositories within the Llama Stack.

Health Check

  • Last Commit: 7 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 2
  • Issues (30d): 1
  • Star History: 111 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), and 20 more.

TinyLlama by jzhang38

Top 0.1% · 9k stars
Tiny pretraining project for a 1.1B Llama model
Created 2 years ago
Updated 1 year ago
Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Vincent Weisser (Cofounder of Prime Intellect), and 25 more.

alpaca-lora by tloen

Top 0.0% · 19k stars
LoRA fine-tuning for LLaMA
Created 2 years ago
Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), John Yang (Coauthor of SWE-bench, SWE-agent), and 28 more.

stanford_alpaca by tatsu-lab

Top 0.1% · 30k stars
Instruction-following LLaMA model training and data generation
Created 2 years ago
Updated 1 year ago
Starred by Roy Frostig (Coauthor of JAX; Research Scientist at Google DeepMind), Zhiqiang Xie (Coauthor of SGLang), and 40 more.

llama by meta-llama

Top 0.1% · 59k stars
Inference code for Llama 2 models (deprecated)
Created 2 years ago
Updated 7 months ago