alpaca-lora by tloen

LoRA fine-tuning for LLaMA

created 2 years ago
18,930 stars

Top 2.4% on sourcepulse

View on GitHub
Project Summary

This repository provides code and weights for fine-tuning the LLaMA language model using Low-Rank Adaptation (LoRA) to achieve instruction-following capabilities comparable to text-davinci-003. It targets researchers and developers aiming to run powerful language models on consumer hardware, offering a cost-effective and efficient method for customization.

How It Works

The project leverages Hugging Face's PEFT library and Tim Dettmers' bitsandbytes for efficient fine-tuning. LoRA injects trainable low-rank matrices into the transformer layers, significantly reducing the number of parameters to update. This approach allows for rapid training on a single consumer GPU (e.g., RTX 4090) within hours, while maintaining high-quality outputs comparable to larger, fully fine-tuned models.
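
A minimal sketch of this mechanism with PEFT is shown below: the adapter configuration targets the attention projections and leaves the base weights frozen, so only the small LoRA matrices are trained. The hyperparameter values and module names are illustrative assumptions chosen to match typical LoRA setups for LLaMA, not an exact copy of the repository's finetune.py.

    # Sketch: attach LoRA adapters to a base LLaMA model with PEFT.
    # Hyperparameters (r=8, alpha=16, q_proj/v_proj targets) are illustrative assumptions.
    from peft import LoraConfig, get_peft_model
    from transformers import LlamaForCausalLM

    base_model = LlamaForCausalLM.from_pretrained(
        "decapoda-research/llama-7b-hf",
        load_in_8bit=True,   # bitsandbytes 8-bit weights to fit a single consumer GPU
        device_map="auto",
    )

    lora_config = LoraConfig(
        r=8,                                  # rank of the low-rank update matrices
        lora_alpha=16,                        # scaling applied to the LoRA update
        target_modules=["q_proj", "v_proj"],  # attention projections that receive adapters
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base_model, lora_config)
    model.print_trainable_parameters()  # only a small fraction of weights are trainable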

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt (if bitsandbytes does not work out of the box, install it from source).
  • Training requires a base LLaMA model (e.g., decapoda-research/llama-7b-hf) and a dataset (e.g., yahma/alpaca-cleaned).
  • Inference requires a base LLaMA model and LoRA weights (e.g., tloen/alpaca-lora-7b).
  • Docker support is available for both training and inference.
  • Official quick-start and inference examples are provided in the README; a minimal inference sketch follows this list.
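
The sketch below loads a base model and LoRA adapter for inference. It assumes transformers, peft, and bitsandbytes are installed, uses the example model and adapter names from the list above, and abbreviates the Alpaca prompt template for illustration.

    # Minimal inference sketch: base LLaMA + LoRA adapter, loaded in 8-bit.
    # Model/adapter IDs follow the examples above; the prompt format is simplified.
    import torch
    from peft import PeftModel
    from transformers import LlamaForCausalLM, LlamaTokenizer

    tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
    model = LlamaForCausalLM.from_pretrained(
        "decapoda-research/llama-7b-hf",
        load_in_8bit=True,           # bitsandbytes 8-bit loading for consumer GPUs
        torch_dtype=torch.float16,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(model, "tloen/alpaca-lora-7b")
    model.eval()

    prompt = "### Instruction:\nTell me about alpacas.\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))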

Highlighted Details

  • Achieves instruction-following quality comparable to text-davinci-003.
  • Training completes within hours on a single RTX 4090.
  • Supports LLaMA models of various sizes (7B, 13B, 30B, 65B).
  • Includes scripts for exporting LoRA weights to standard Hugging Face format for use with projects like llama.cpp (see the merge sketch after this list).
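
The export scripts themselves live in the repository; the sketch below shows the same idea using PEFT's merge_and_unload, which folds the LoRA deltas back into the base weights so the result can be saved as a standard Hugging Face checkpoint. The model IDs and output path are illustrative assumptions, not the scripts' actual arguments.

    # Sketch: merge LoRA weights into the base model to produce a standard HF checkpoint.
    # Mirrors what the repo's export scripts achieve; paths and IDs are assumptions.
    import torch
    from peft import PeftModel
    from transformers import LlamaForCausalLM

    base = LlamaForCausalLM.from_pretrained(
        "decapoda-research/llama-7b-hf", torch_dtype=torch.float16
    )
    model = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b")
    merged = model.merge_and_unload()               # fold LoRA deltas into the base weights
    merged.save_pretrained("./alpaca-lora-merged")  # hypothetical output directory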

Maintenance & Community

  • The project has an active Discord server for discussion and support.
  • Weights on Hugging Face Hub are updated regularly.
  • Various community-contributed adapters for different languages and datasets are linked.

Licensing & Compatibility

  • The code is released under the Apache 2.0 license.
  • The dataset used (Stanford Alpaca) is under the ODC Attribution License.
  • Compatibility with commercial or closed-source projects is generally good due to the Apache 2.0 license for the code, but users should verify the licenses of the base LLaMA model and any specific datasets or weights used.

Limitations & Caveats

The README notes that model performance could be significantly improved with a higher-quality dataset. Users facing issues with response lengths should ensure they are using the latest code and weights.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 100 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 3 more.

LLaMA-Adapter by OpenGVLab: Efficient fine-tuning for instruction-following LLaMA models (6k stars, created 2 years ago, updated 1 year ago).