Instruction-following LLaMA model training and data generation
Stanford Alpaca provides the code and data to train an instruction-following language model based on Meta's LLaMA. It targets researchers and developers looking to replicate or build upon instruction-tuned LLMs, offering a cost-effective method for generating diverse instruction data and fine-tuning LLaMA models.
How It Works
Alpaca fine-tunes LLaMA models on a 52K instruction-following dataset generated via the Self-Instruct methodology, modified to use text-davinci-003 for data creation and to employ aggressive batch decoding for cost efficiency. This approach aims to produce a model with instruction-following capabilities comparable to text-davinci-003 at a significantly lower cost.
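The batch-decoding trick is the main cost lever: many generation prompts are packed into a single Completions request rather than issued one at a time. Below is a minimal sketch of that idea, assuming the legacy openai<1.0 Python SDK; the `build_prompt` helper and the seed instructions are hypothetical, and the repository's own generation script implements the full Self-Instruct-style pipeline.

```python
# Sketch of batched instruction generation against text-davinci-003.
# Assumes the legacy openai<1.0 SDK and OPENAI_API_KEY in the environment.
import openai

def build_prompt(seed_tasks):
    """Hypothetical helper: pack a few seed instructions into a prompt
    that asks the model to propose a new instruction/input/output triple."""
    examples = "\n".join(f"- {t}" for t in seed_tasks)
    return (
        "You are asked to come up with a new, diverse task instruction.\n"
        f"Here are some example instructions:\n{examples}\n"
        "New instruction, input, and output:"
    )

seed_batches = [
    ["Summarize the paragraph below.", "Translate the sentence to French."],
    ["Write a haiku about autumn.", "Classify the sentiment of this review."],
]

# One API call carries several prompts ("aggressive batch decoding"),
# which amortizes request overhead and lowers generation cost.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt=[build_prompt(batch) for batch in seed_batches],
    max_tokens=512,
    temperature=1.0,
    top_p=1.0,
)

# Each choice carries an index tying it back to its prompt in the batch.
for choice in response.choices:
    print(choice.index, choice.text.strip()[:80])
```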
Quick Start & Requirements
Install the Python dependencies with:

```bash
pip install -r requirements.txt
```
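The released dataset, alpaca_data.json, is a list of JSON records with instruction, input, and output fields; fine-tuning turns each record into a single prompt/response pair. The sketch below shows that formatting step; the template wording follows the project's published prompt but should be treated as illustrative rather than authoritative.

```python
# Sketch: format a record from alpaca_data.json into a training prompt.
# Field names (instruction, input, output) match the released dataset.
import json

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def format_example(example):
    template = PROMPT_WITH_INPUT if example.get("input") else PROMPT_NO_INPUT
    prompt = template.format(**example)
    # The target the model is trained to produce is the reference output.
    return prompt, example["output"]

with open("alpaca_data.json") as f:
    data = json.load(f)

prompt, target = format_example(data[0])
print(prompt + target)
```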
Highlighted Details
- Achieves instruction-following performance comparable to text-davinci-003 on Self-Instruct benchmarks.
- Fine-tuning builds on Hugging Face transformers with PyTorch FSDP or DeepSpeed for distributed training and memory optimization; the resulting checkpoint loads with standard transformers APIs (see the sketch below).
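A minimal inference sketch for a fine-tuned checkpoint, assuming transformers and accelerate are installed; the checkpoint path and the example instruction are hypothetical.

```python
# Sketch: generate from a fine-tuned Alpaca-style checkpoint with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "./alpaca-output"  # hypothetical path to a fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\nGive three tips for staying healthy.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Strip the prompt tokens and print only the model's response.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```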
Maintenance & Community
The project is a research effort from Stanford University, with contributions from graduate students. Key advisors include Tatsunori B. Hashimoto, Percy Liang, and Carlos Guestrin.
Licensing & Compatibility
The code is released under the Apache 2.0 license and the 52K instruction dataset under CC BY-NC 4.0. The fine-tuned model is intended for research use only and may not be used commercially, reflecting LLaMA's non-commercial license and OpenAI's terms of use for text-davinci-003.
Limitations & Caveats
The model is still under development and has not been fine-tuned to be safe or harmless; users should exercise caution. The live demo has been suspended.