stanford_alpaca by tatsu-lab

Instruction-following LLaMA model training and data generation

created 2 years ago
30,096 stars

Top 1.2% on sourcepulse

View on GitHub
Project Summary

Stanford Alpaca provides the code and data to train an instruction-following language model based on Meta's LLaMA. It targets researchers and developers looking to replicate or build upon instruction-tuned LLMs, offering a cost-effective method for generating diverse instruction data and fine-tuning LLaMA models.

How It Works

Alpaca fine-tunes LLaMA on a 52K instruction-following dataset generated with the Self-Instruct method, modified to use text-davinci-003 for data creation and aggressive batch decoding for cost efficiency. The approach aims to produce instruction-following capabilities comparable to text-davinci-003 at a significantly lower cost.
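As a rough sketch of the batched generation step (assumptions: the legacy OpenAI completions endpoint, which accepted a list of prompts per request; text-davinci-003 has since been deprecated, and the helper names below are hypothetical rather than the repo's actual generation script):

```python
# Illustrative only: batch several Self-Instruct prompts into one
# completions request to amortize per-request cost and latency.
# Uses the legacy openai<1.0 Completion API with the now-deprecated
# text-davinci-003 model.
import openai

openai.api_key = "sk-..."  # placeholder

# build_prompt / seed_batches are hypothetical helpers: each prompt
# shows the model a few seed tasks and asks it to write new ones.
prompts = [build_prompt(batch) for batch in seed_batches]

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompts,     # a list of prompts is decoded as one batched call
    max_tokens=1024,
    temperature=1.0,
    top_p=1.0,
)

# Each returned choice carries an index mapping it back to its prompt.
new_tasks = [c.text for c in sorted(response.choices, key=lambda c: c.index)]
```

Each record in the released alpaca_data.json stores an instruction, an optional input, and an output string.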

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires an OpenAI API key for data generation.
  • Fine-tuning requires significant GPU resources (e.g., 4x A100 80G GPUs for LLaMA-7B).
  • The README links to official guides for converting LLaMA weights to Hugging Face format (a minimal loading sketch follows this list).
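A minimal sketch of what the converted checkpoint looks like to downstream code (the local path is a placeholder, not a file shipped with the repo):

```python
# Minimal sketch: load an HF-converted LLaMA checkpoint with transformers.
# The conversion itself is done with the script referenced in the
# transformers documentation; this only shows the result being loaded.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/hf-converted-llama-7b"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
```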

Highlighted Details

  • 52K instruction-following data generated for under $500.
  • Preliminary evaluation shows performance similar to text-davinci-003 on Self-Instruct benchmarks.
  • Supports fine-tuning LLaMA and OPT models using Hugging Face transformers and PyTorch FSDP or DeepSpeed for distributed training and memory optimization.
  • Provides instructions for recovering the Alpaca-7B weights from the released weight diffs (sketched below).
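Conceptually, weight-diff recovery is tensor-wise addition of the released diff onto the base LLaMA state dict. The repo ships a dedicated script for this; the sketch below is a simplified stand-in, not that script, and omits its integrity checks and tokenizer handling:

```python
# Conceptual sketch of weight-diff recovery: tuned = base + diff,
# applied tensor by tensor over the state dicts. Paths are placeholders.
import torch

def recover(base_path: str, diff_path: str, out_path: str) -> None:
    base = torch.load(base_path, map_location="cpu")  # base LLaMA weights
    diff = torch.load(diff_path, map_location="cpu")  # released weight diff
    tuned = {name: base[name] + diff[name] for name in diff}
    torch.save(tuned, out_path)
```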

Maintenance & Community

The project is a research effort from Stanford University, with contributions from graduate students. Key advisors include Tatsunori B. Hashimoto, Percy Liang, and Carlos Guestrin.

Licensing & Compatibility

  • Dataset: CC BY-NC 4.0 (Non-Commercial Use Only).
  • Weight diff: CC BY-NC 4.0 (Non-Commercial Use Only).
  • Models trained using the dataset are restricted to research purposes.
  • Not licensed for commercial use.

Limitations & Caveats

The model is still under development and has not been fine-tuned for safety or harmlessness. Users are encouraged to exercise caution. The live demo is suspended.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star history: 223 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Travis Fischer (Founder of Agentic), and 6 more.

codellama by meta-llama

Top 0.1% · 16k stars
Inference code for CodeLlama models
created 1 year ago, updated 11 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 9 more.

alpaca-lora by tloen

Top 0.0% · 19k stars
LoRA fine-tuning for LLaMA
created 2 years ago, updated 1 year ago