stanford_alpaca by tatsu-lab

Instruction-following LLaMA model training and data generation

created 2 years ago
30,096 stars

Top 1.2% on sourcepulse

View on GitHub
Project Summary

Stanford Alpaca provides the code and data to train an instruction-following language model based on Meta's LLaMA. It targets researchers and developers looking to replicate or build upon instruction-tuned LLMs, offering a cost-effective method for generating diverse instruction data and fine-tuning LLaMA models.

How It Works

Alpaca fine-tunes LLaMA on a 52K instruction-following dataset generated with the Self-Instruct method, modified to use text-davinci-003 for data creation and aggressive batch decoding for cost efficiency. The approach aims to produce instruction-following capabilities comparable to text-davinci-003 at a significantly lower cost.
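As a rough sketch of the batched generation step (assumptions: the legacy OpenAI completions endpoint, which accepted a list of prompts per request; text-davinci-003 has since been deprecated, and the helper names below are hypothetical rather than the repo's actual generation script):

```python
# Illustrative only: batch several Self-Instruct prompts into one
# completions request to amortize per-request cost and latency.
# Uses the legacy openai<1.0 Completion API with the now-deprecated
# text-davinci-003 model.
import openai

openai.api_key = "sk-..."  # placeholder

# build_prompt / seed_batches are hypothetical helpers: each prompt
# shows the model a few seed tasks and asks it to write new ones.
prompts = [build_prompt(batch) for batch in seed_batches]

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompts,     # a list of prompts is decoded as one batched call
    max_tokens=1024,
    temperature=1.0,
    top_p=1.0,
)

# Each returned choice carries an index mapping it back to its prompt.
new_tasks = [c.text for c in sorted(response.choices, key=lambda c: c.index)]
```

Each record in the released alpaca_data.json stores an instruction, an optional input, and an output string.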

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires an OpenAI API key for data generation.
  • Fine-tuning requires significant GPU resources (e.g., 4x A100 80G GPUs for LLaMA-7B).
  • The README links to official guides for converting LLaMA weights to Hugging Face format (a minimal loading sketch follows this list).
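A minimal sketch of what the converted checkpoint looks like to downstream code (the local path is a placeholder, not a file shipped with the repo):

```python
# Minimal sketch: load an HF-converted LLaMA checkpoint with transformers.
# The conversion itself is done with the script referenced in the
# transformers documentation; this only shows the result being loaded.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/hf-converted-llama-7b"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
```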

Highlighted Details

  • 52K instruction-following data generated for under $500.
  • Preliminary evaluation shows performance similar to text-davinci-003 on Self-Instruct benchmarks.
  • Supports fine-tuning LLaMA and OPT models using Hugging Face transformers and PyTorch FSDP or DeepSpeed for distributed training and memory optimization.
  • Provides instructions for recovering the Alpaca-7B weights from the released weight diffs (sketched below).
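Conceptually, weight-diff recovery is tensor-wise addition of the released diff onto the base LLaMA state dict. The repo ships a dedicated script for this; the sketch below is a simplified stand-in, not that script, and omits its integrity checks and tokenizer handling:

```python
# Conceptual sketch of weight-diff recovery: tuned = base + diff,
# applied tensor by tensor over the state dicts. Paths are placeholders.
import torch

def recover(base_path: str, diff_path: str, out_path: str) -> None:
    base = torch.load(base_path, map_location="cpu")  # base LLaMA weights
    diff = torch.load(diff_path, map_location="cpu")  # released weight diff
    tuned = {name: base[name] + diff[name] for name in diff}
    torch.save(tuned, out_path)
```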

Maintenance & Community

The project is a research effort from Stanford University, with contributions from graduate students. Key advisors include Tatsunori B. Hashimoto, Percy Liang, and Carlos Guestrin.

Licensing & Compatibility

  • Dataset: CC BY-NC 4.0 (Non-Commercial Use Only).
  • Weight diff: CC BY-NC 4.0 (Non-Commercial Use Only).
  • Models trained using the dataset are restricted to research purposes.
  • Not licensed for commercial use.

Limitations & Caveats

The model is still under development and has not been fine-tuned for safety or harmlessness. Users are encouraged to exercise caution. The live demo is suspended.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star history: 223 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Travis Fischer (Founder of Agentic), and 6 more.

codellama by meta-llama

Top 0.1% · 16k stars
Inference code for CodeLlama models
created 1 year ago, updated 11 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 9 more.

alpaca-lora by tloen

Top 0.0% · 19k stars
LoRA fine-tuning for LLaMA
created 2 years ago, updated 1 year ago