Instruction-following model based on OpenLLaMA
OpenAlpaca provides fully open-source, instruction-following large language models based on OpenLLaMA. It targets researchers and developers seeking a permissively licensed alternative to proprietary models, enabling academic and commercial use with a relatively fast fine-tuning process.
How It Works
OpenAlpaca fine-tunes the OpenLLaMA base model on a dataset derived from databricks-dolly-15k, filtered for length. Fine-tuning uses specific prompt formats to guide the model in understanding and responding to instructions, with or without additional context. The model weights are released under the Apache 2.0 license, and the data under CC BY-SA 3.0.
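As a rough illustration, the length filtering and prompt construction described above might look like the sketch below. The word-count threshold, record fields, and prompt wording are assumptions for illustration, not taken from the project's actual scripts.

```python
# Hypothetical sketch of OpenAlpaca-style data preparation.
# The length cutoff and prompt templates are illustrative assumptions.

# Alpaca-style templates, with and without an additional context field.
PROMPT_WITH_CONTEXT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{context}\n\n### Response:"
)
PROMPT_NO_CONTEXT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

MAX_WORDS = 256  # assumed length cutoff for filtering


def keep_record(record: dict) -> bool:
    """Keep only records whose combined text stays under the word limit."""
    text = " ".join(record.get(k, "") for k in ("instruction", "context", "response"))
    return len(text.split()) <= MAX_WORDS


def build_prompt(record: dict) -> str:
    """Format a dolly-style record into an instruction-following prompt."""
    if record.get("context"):
        return PROMPT_WITH_CONTEXT.format(
            instruction=record["instruction"], context=record["context"]
        )
    return PROMPT_NO_CONTEXT.format(instruction=record["instruction"])


# Example dolly-style records (instruction / context / response fields).
records = [
    {"instruction": "Summarize the text.", "context": "OpenLLaMA is open.",
     "response": "It is open."},
    {"instruction": "Name a color.", "context": "", "response": "Blue."},
]
filtered = [r for r in records if keep_record(r)]
prompts = [build_prompt(r) for r in filtered]
```

The two templates correspond to the "with or without additional context" cases: records carrying a non-empty context field get the input-bearing prompt, and the rest get the shorter one.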
Quick Start & Requirements
pip install -r requirements.txt
pip install torch==1.13.1+cu117 -f https://download.pytorch.org/whl/torch/
Highlighted Details
A dataset processing script, process_dataset.py, is available.

Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The models are fine-tuned on "previewed" versions of OpenLLaMA, suggesting potential for improvement with newer base model releases. The project plans future rigorous evaluations.