LoRA fine-tuning for LLaMA
This repository provides code and weights for fine-tuning the LLaMA language model using Low-Rank Adaptation (LoRA), achieving instruction-following capabilities comparable to text-davinci-003. It targets researchers and developers who want to run capable language models on consumer hardware, offering a cost-effective and efficient method for customization.
How It Works
The project leverages Hugging Face's PEFT library and Tim Dettmers' bitsandbytes for efficient fine-tuning. LoRA injects trainable low-rank matrices into the transformer layers, significantly reducing the number of parameters to update. This allows rapid training on a single consumer GPU (e.g., an RTX 4090) within hours, while maintaining output quality comparable to larger, fully fine-tuned models.
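The parameter savings above come from the core LoRA trick: instead of updating a full d × d weight matrix, only two low-rank factors are trained and their scaled product is added to the frozen weight. A minimal sketch of this idea, using toy sizes rather than the actual LLaMA dimensions:

```python
# Toy illustration of the LoRA update (hypothetical sizes, not LLaMA's).
# Effective weight after fine-tuning: W' = W + (alpha / r) * B @ A,
# where only B (d x r) and A (r x d) are trained; W stays frozen.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r = 8, 2       # hidden size and LoRA rank (toy values)
alpha = 4         # LoRA scaling hyperparameter

W = [[0.0] * d for _ in range(d)]  # frozen pretrained weight (toy: zeros)
B = [[1.0] * r for _ in range(d)]  # trainable down-projection factor
A = [[1.0] * d for _ in range(r)]  # trainable up-projection factor

delta = matmul(B, A)               # low-rank update, rank <= r
scale = alpha / r
W_prime = [[w + scale * dw for w, dw in zip(w_row, d_row)]
           for w_row, d_row in zip(W, delta)]

full_params = d * d                # parameters a full update would train
lora_params = d * r + r * d        # parameters LoRA actually trains
print(full_params, lora_params)    # prints: 64 32
```

At toy sizes the saving is only 2x, but it grows with the hidden size: for d = 4096 and r = 8, LoRA trains roughly 65k parameters per matrix instead of ~16.8M.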
Quick Start & Requirements
- Install dependencies with pip install -r requirements.txt (ensure bitsandbytes is compatible with your setup, or install it from source).
- Obtain the base model weights (e.g., decapoda-research/llama-7b-hf) and an instruction dataset (e.g., yahma/alpaca-cleaned).
- Pretrained LoRA weights are available (e.g., tloen/alpaca-lora-7b).
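Since training goes through PEFT, attaching LoRA adapters to a loaded base model follows the standard PEFT pattern. A sketch with illustrative hyperparameter values (the rank, alpha, and target modules below are typical choices, not necessarily this repository's exact settings):

```python
# Hypothetical LoRA setup via Hugging Face PEFT; values are illustrative.
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,                        # scaling factor (alpha)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# model = get_peft_model(base_model, config)  # base_model: a loaded LLaMA
# model.print_trainable_parameters()          # reports the small LoRA share
```

Only the adapter weights produced this way need to be saved and shared, which is why the published LoRA checkpoints are small compared to the base model.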
Highlighted Details
- Instruction-following quality comparable to text-davinci-003.
- Compatible with llama.cpp for local inference.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The README notes that model performance could be significantly improved with a higher-quality dataset. Users facing issues with response lengths should ensure they are using the latest code and weights.