huggingface-llama-recipes by huggingface

Recipes for Llama 3 models

created 1 year ago
678 stars

Top 51.0% on sourcepulse

View on GitHub
Project Summary

This repository provides minimal, runnable examples for getting started quickly with Meta's Llama 3.x family of models, including Llama 3.1, 3.2, and 3.3. It targets developers and researchers who want to experiment with Llama models for inference, fine-tuning, and advanced use cases such as assisted decoding and retrieval-augmented generation (RAG), offering a practical entry point to these models.

How It Works

The recipes build on the Hugging Face transformers library for seamless integration with Llama models. They demonstrate core functionality such as text generation via the pipeline API, local inference with various quantization techniques (4-bit, 8-bit, AWQ, GPTQ), and fine-tuning with PEFT and TRL. The approach emphasizes practical, code-first examples for rapid adoption and experimentation.
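As a minimal illustration of the pipeline-based flow described above, a sketch along these lines should work; the checkpoint ID, dtype, and generation settings are assumptions rather than values taken from the repository:

```python
# Minimal text-generation sketch using the transformers pipeline API.
# The checkpoint ID below is an assumption; any Llama checkpoint whose
# license you have accepted will work.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # gated checkpoint (assumed)
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain the KV cache in one sentence."}]
result = generator(messages, max_new_tokens=96)
print(result[0]["generated_text"])  # full chat, ending with the model's reply
```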

Quick Start & Requirements

  • Install: pip install -U transformers
  • Prerequisites: a CUDA-enabled GPU is recommended for reasonable performance. Access to the Llama models requires accepting Meta's license and requesting access on the Hugging Face model pages.
  • Resources: memory requirements vary significantly by model size and quantization; for example, Llama 3.1 8B in 4-bit needs roughly 4 GB for the weights alone (see the loading sketch after this list).
  • Links: Hugging Face announcement blog post (3.1), Open Source AI Cookbook
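A hedged sketch of the 4-bit loading path referenced above; the checkpoint ID and quantization settings are assumptions, and bitsandbytes plus accelerate are extra dependencies beyond transformers:

```python
# Sketch: load a Llama checkpoint in 4-bit via bitsandbytes to reduce memory use.
# Assumes `pip install bitsandbytes accelerate` and that the model license has
# been accepted; authenticate once with `huggingface-cli login`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Quantization reduces memory usage by", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```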

Highlighted Details

  • Supports Llama 3.1, 3.2, and 3.3 variants, including the 405B-parameter model.
  • Demonstrates advanced techniques such as assisted decoding for up to a 2x generation speedup (see the sketch after this list) and integration with Llama Guard for safety filtering.
  • Includes recipes for fine-tuning on custom datasets and building RAG pipelines.
  • Covers performance optimizations using torch.compile and KV cache quantization.
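A minimal sketch of assisted decoding via the generate API, assuming a small Llama 3.2 1B draft model alongside a larger target; both checkpoint IDs are assumptions, and the two models must share a tokenizer:

```python
# Sketch of assisted (speculative) decoding: a small assistant model drafts
# tokens that the larger target model verifies, which can speed up generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "meta-llama/Llama-3.1-8B-Instruct"     # assumed target checkpoint
assistant_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed draft checkpoint

tokenizer = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(
    target_id, torch_dtype=torch.bfloat16, device_map="auto"
)
assistant = AutoModelForCausalLM.from_pretrained(
    assistant_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Speculative decoding works by", return_tensors="pt").to(target.device)
output = target.generate(**inputs, assistant_model=assistant, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```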

Maintenance & Community

The repository is actively maintained by Hugging Face. Updates and community discussion are available through Hugging Face's official channels.

Licensing & Compatibility

The recipes themselves are likely under a permissive license (e.g., Apache 2.0), but the use of Llama models is governed by Meta's Llama license, which may have restrictions on commercial use and redistribution.

Limitations & Caveats

The repository is explicitly marked as "WIP" (Work In Progress), indicating potential for frequent changes and instability. Access to Llama models is gated by Meta's approval process.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

37 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 10 more.

open_llama by openlm-research
8k stars
Open-source reproduction of LLaMA models
created 2 years ago
updated 2 years ago