lora by cloneofsimo

LoRA tool for fast diffusion fine-tuning

Created 3 years ago
7,529 stars

Top 6.9% on SourcePulse

Project Summary

This repository provides a method for efficiently fine-tuning diffusion models, specifically Stable Diffusion, for text-to-image generation. It targets users who want to customize models with their own datasets, offering significantly faster training and much smaller output files compared to full fine-tuning, enabling easier sharing and experimentation.

How It Works

The core innovation is Low-Rank Adaptation (LoRA), which injects trainable low-rank matrices into the pre-trained model's weights. Instead of updating the entire weight matrix $W$, LoRA trains smaller matrices $A$ and $B$ such that $\Delta W = AB^T$. This drastically reduces the number of trainable parameters, leading to faster training and compact model outputs (1–6 MB). The method can be applied to the UNet, Text Encoder, or both, and integrates with techniques like Dreambooth and Pivotal Tuning Inversion for enhanced results.
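The update above can be sketched in a few lines of numpy. This is an illustrative toy, not the repository's actual implementation; the dimensions, rank, and initialization scale are hypothetical (though zero-initializing one factor so that $\Delta W$ starts at zero mirrors standard LoRA practice):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 768, 4                            # hypothetical hidden size and LoRA rank
W = rng.standard_normal((d, d))          # frozen pre-trained weight (not trained)
A = rng.standard_normal((d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # trainable factor, zero-init so ΔW = AB^T starts at 0

x = rng.standard_normal(d)

# Adapted forward pass: y = (W + ΔW) x, computed as Wx + A (B^T x)
# so the full d×d matrix ΔW is never materialized.
y = W @ x + A @ (B.T @ x)

# Only A and B receive gradients: 2*d*r parameters vs d*d for full fine-tuning.
full_params = d * d
lora_params = 2 * d * r
print(full_params, lora_params, full_params / lora_params)  # 589824 6144 96.0
```

With these toy dimensions the trainable parameter count drops 96-fold, which is the source of both the speedup and the small (1–6 MB) output files.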

Quick Start & Requirements

  • Install: pip install git+https://github.com/cloneofsimo/lora.git
  • Prerequisites: Python, PyTorch, Hugging Face diffusers, transformers, accelerate, xformers (recommended for performance). CUDA-enabled GPU is highly recommended for practical training.
  • Example CLI: lora_pti --pretrained_model_name_or_path=... --instance_data_dir=... --output_dir=... (see README for full parameters).
  • Demo: Integrated into Huggingface Spaces via Gradio.
  • Colab: Example notebooks are available.

Highlighted Details

  • Fine-tunes Stable Diffusion models up to twice as fast as Dreambooth.
  • Generates very small LoRA files (1–6 MB), ideal for sharing.
  • Supports fine-tuning the UNet and the Text Encoder (CLIP).
  • Offers merging capabilities for combining multiple LoRAs or merging LoRAs with base models.
  • Integrates Pivotal Tuning Inversion for improved results.
  • Supports safetensor format and xformers for performance.

Maintenance & Community

The README documents frequent updates from the project's active development period, and LoRA support has since been integrated into the Hugging Face diffusers library. Community discussion, tips, and contributions were encouraged via pull requests.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, its integration with Hugging Face diffusers suggests compatibility with common open-source workflows. Users should verify licensing for commercial use.

Limitations & Caveats

The README lists thorough performance comparisons against full fine-tuning as future work. Some features, such as Kronecker product adaptation and time-aware fine-tuning, remain TODOs. Improved documentation and better usability for non-programmers are also noted as areas for development.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Rodrigo Nader (Cofounder of Langflow), and 1 more.

DiffSynth-Studio by modelscope (Top 0.4%, 12k stars): Open-source project for diffusion model exploration. Created 2 years ago; updated 4 days ago.

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 5 more.

ai-toolkit by ostris (Top 0.8%, 10k stars): Training toolkit for finetuning diffusion models. Created 2 years ago; updated 17 hours ago.