Inference code for Llama 2 models (deprecated)
This repository provides inference code for Meta's Llama models, specifically Llama 2. It's designed for researchers and businesses to load and run pre-trained and fine-tuned language models, ranging from 7B to 70B parameters, enabling experimentation and application development.
How It Works
The project utilizes PyTorch and a model-parallelism approach for efficient inference. It allows loading model weights and tokenizers, with specific scripts for text completion and chat-based interactions. The architecture supports varying model-parallel (MP) values depending on model size (7B=1, 13B=2, 70B=8) and allows customization of sequence length and batch size for hardware optimization.
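For instance, the example scripts are launched with torchrun, with --nproc_per_node set to the model's MP value. A minimal sketch following the command shape in the upstream README (the checkpoint and tokenizer paths are placeholders for wherever the downloaded weights live):

    # 7B base model: MP=1, so a single process/GPU
    torchrun --nproc_per_node 1 example_text_completion.py \
        --ckpt_dir llama-2-7b/ \
        --tokenizer_path tokenizer.model \
        --max_seq_len 128 --max_batch_size 4

    # chat-tuned weights go through the chat script; a 70B model would need --nproc_per_node 8
    torchrun --nproc_per_node 1 example_chat_completion.py \
        --ckpt_dir llama-2-7b-chat/ \
        --tokenizer_path tokenizer.model \
        --max_seq_len 512 --max_batch_size 6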
Quick Start & Requirements
Install with pip install -e . within a conda environment with PyTorch/CUDA. Requirements: wget, md5sum, and PyTorch with CUDA support. Model weights must be downloaded separately from Meta's website after accepting their license.
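End to end, a typical setup looks like the sketch below. The repository URL and the download.sh script name are taken from the upstream project and may change as the repo is archived; the script prompts for the signed URL Meta emails once the license is accepted:

    # inside a conda environment that already has PyTorch with CUDA support
    git clone https://github.com/meta-llama/llama.git
    cd llama
    pip install -e .
    # requires wget and md5sum; paste the signed download URL when prompted
    ./download.sh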
Maintenance & Community
This repository is deprecated in favor of the consolidated Llama Stack. New development and support are directed to llama-models, PurpleLlama, llama-toolchain, llama-agentic-system, and llama-cookbook; issues can be filed on those new repositories.
Licensing & Compatibility
Model weights and code are licensed for both research and commercial use. An Acceptable Use Policy is provided.
Limitations & Caveats
This repository is deprecated. Users are directed to use the new Llama Stack repositories for current development and support. The README notes that testing has not covered all potential use scenarios, and users should consult the Responsible Use Guide.