stanford_alpaca by tatsu-lab

Instruction-following LLaMA model training and data generation

Created 2 years ago
30,205 stars

Top 1.2% on SourcePulse

View on GitHub
Project Summary

Stanford Alpaca provides the code and data to train an instruction-following language model based on Meta's LLaMA. It targets researchers and developers looking to replicate or build upon instruction-tuned LLMs, offering a cost-effective method for generating diverse instruction data and fine-tuning LLaMA models.

How It Works

Alpaca fine-tunes LLaMA models on a 52K instruction-following dataset generated with the Self-Instruct methodology, modified to use OpenAI's text-davinci-003 for data generation and aggressive batch decoding to reduce cost. This approach aims to produce a model with instruction-following capabilities comparable to text-davinci-003 at a significantly lower cost.
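Each record in the released dataset pairs an instruction with an optional input and a target output, and fine-tuning formats these into a single prompt. Below is a minimal sketch of that formatting; the field names match the released alpaca_data.json, but the exact template strings are defined in the repo's training code and may differ slightly.

```python
# Minimal sketch of the Alpaca data format and prompt construction.
# Field names follow the released alpaca_data.json; the exact template
# text lives in the repo's training code and may differ slightly.

example = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "I loved the movie!",
    "output": "Positive",
}

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(ex: dict) -> str:
    """Format one record into the text the model is trained to complete."""
    template = PROMPT_WITH_INPUT if ex.get("input") else PROMPT_NO_INPUT
    return template.format(**ex)

# During fine-tuning, the target appended to the prompt is ex["output"].
print(build_prompt(example) + example["output"])
```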

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Requires an OpenAI API key for data generation.
  • Fine-tuning requires significant GPU resources (e.g., 4x A100 80G GPUs for LLaMA-7B).
  • The README references official guides for converting the original LLaMA weights to the Hugging Face format (see the loading sketch below).
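
Once the LLaMA weights are converted, they load like any other Hugging Face checkpoint. A minimal sketch, assuming the converted checkpoint sits at a placeholder local path (not a path from the repo):

```python
# Minimal sketch: load LLaMA weights after conversion to the Hugging Face
# format. The local path is a placeholder for wherever the converted
# checkpoint was written.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/llama-7b-hf"  # hypothetical output of the conversion step

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to fit in GPU memory
    device_map="auto",          # requires `accelerate`; spreads layers across GPUs
)

prompt = "List three uses of instruction-tuned language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```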

Highlighted Details

  • 52K instruction-following data generated for under $500.
  • Preliminary evaluation shows performance similar to text-davinci-003 on Self-Instruct benchmarks.
  • Supports fine-tuning LLaMA and OPT models using Hugging Face transformers and PyTorch FSDP or DeepSpeed for distributed training and memory optimization.
  • Provides instructions to recover the Alpaca-7B weights from the released weight diff (see the sketch below).
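
Conceptually, recovering the tuned weights means adding the released diff to the original LLaMA weights, parameter by parameter. The sketch below illustrates the idea with placeholder paths; the repo's weight_diff.py script is the supported route and also handles details such as the added pad token, so prefer it for actual recovery.

```python
# Conceptual sketch of weight-diff recovery: tuned = base + diff, parameter by
# parameter. All paths are placeholders; prefer the repo's weight_diff.py for
# an exact, verified recovery.
import torch
from transformers import AutoModelForCausalLM

base_path = "/path/to/llama-7b-hf"       # original LLaMA, converted to HF format
diff_path = "/path/to/alpaca-7b-wdiff"   # released weight diff
out_path = "/path/to/alpaca-7b-recovered"

base = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.float32)
diff = AutoModelForCausalLM.from_pretrained(diff_path, torch_dtype=torch.float32)

# The tuned model may use a slightly larger vocabulary (e.g. an added [PAD]
# token); resize the base embeddings so the parameter shapes line up.
tuned_vocab = diff.get_input_embeddings().weight.shape[0]
if base.get_input_embeddings().weight.shape[0] != tuned_vocab:
    base.resize_token_embeddings(tuned_vocab)

# Add the base weights into the diff model in place, yielding the tuned model.
with torch.no_grad():
    base_params = dict(base.named_parameters())
    for name, param in diff.named_parameters():
        param.add_(base_params[name])

diff.save_pretrained(out_path)
# The matching tokenizer is assumed to ship with the weight diff; if so, save
# it alongside: AutoTokenizer.from_pretrained(diff_path).save_pretrained(out_path)
```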

Maintenance & Community

The project is a research effort from Stanford University, with contributions from graduate students. Key advisors include Tatsunori B. Hashimoto, Percy Liang, and Carlos Guestrin.

Licensing & Compatibility

  • Dataset: CC BY NC 4.0 (Non-Commercial Use Only).
  • Weight diff: CC BY NC 4.0 (Non-Commercial Use Only).
  • Models trained using the dataset are restricted to research purposes.
  • Not licensed for commercial use.

Limitations & Caveats

The model is still under development and has not been fine-tuned for safety or harmlessness. Users are encouraged to exercise caution. The live demo is suspended.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 61 stars in the last 30 days

Explore Similar Projects

Starred by Jiaming Song (Chief Scientist at Luma AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

LLaMA-Adapter by OpenGVLab

0.0%
6k
Efficient fine-tuning for instruction-following LLaMA models
Created 2 years ago
Updated 1 year ago
Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Vincent Weisser (Cofounder of Prime Intellect), and 25 more.

alpaca-lora by tloen

0.0%
19k
LoRA fine-tuning for LLaMA
Created 2 years ago
Updated 1 year ago