lora by cloneofsimo

LoRA tool for fast diffusion fine-tuning

Created 3 years ago
7,529 stars

Top 6.9% on SourcePulse

Project Summary

This repository provides a method for efficiently fine-tuning diffusion models, specifically Stable Diffusion, for text-to-image generation. It targets users who want to customize models with their own datasets, offering significantly faster training and much smaller output files compared to full fine-tuning, enabling easier sharing and experimentation.

How It Works

The core innovation is Low-Rank Adaptation (LoRA), which injects trainable low-rank matrices into the pre-trained model's weights. Instead of updating the entire weight matrix $W$, LoRA trains smaller matrices $A$ and $B$ such that $\Delta W = AB^T$. This drastically reduces the number of trainable parameters, leading to faster training and compact model outputs (1–6 MB). The method can be applied to the UNet, Text Encoder, or both, and integrates with techniques like Dreambooth and Pivotal Tuning Inversion for enhanced results.
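The update above can be sketched in a few lines of numpy. This is an illustrative toy, not the repository's actual implementation; the dimensions, rank, and initialization scale are hypothetical (though zero-initializing one factor so that $\Delta W$ starts at zero mirrors standard LoRA practice):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 768, 4                            # hypothetical hidden size and LoRA rank
W = rng.standard_normal((d, d))          # frozen pre-trained weight (not trained)
A = rng.standard_normal((d, r)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # trainable factor, zero-init so ΔW = AB^T starts at 0

x = rng.standard_normal(d)

# Adapted forward pass: y = (W + ΔW) x, computed as Wx + A (B^T x)
# so the full d×d matrix ΔW is never materialized.
y = W @ x + A @ (B.T @ x)

# Only A and B receive gradients: 2*d*r parameters vs d*d for full fine-tuning.
full_params = d * d
lora_params = 2 * d * r
print(full_params, lora_params, full_params / lora_params)  # 589824 6144 96.0
```

With these toy dimensions the trainable parameter count drops 96-fold, which is the source of both the speedup and the small (1–6 MB) output files.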

Quick Start & Requirements

  • Install: pip install git+https://github.com/cloneofsimo/lora.git
  • Prerequisites: Python, PyTorch, Hugging Face diffusers, transformers, accelerate, xformers (recommended for performance). CUDA-enabled GPU is highly recommended for practical training.
  • Example CLI: lora_pti --pretrained_model_name_or_path=... --instance_data_dir=... --output_dir=... (see README for full parameters).
  • Demo: Integrated into Huggingface Spaces via Gradio.
  • Colab: Example notebooks are available.

Highlighted Details

  • Fine-tunes Stable Diffusion models up to twice as fast as Dreambooth.
  • Generates very small LoRA files (1–6 MB), ideal for sharing.
  • Supports fine-tuning the UNet and the Text Encoder (CLIP).
  • Offers merging capabilities for combining multiple LoRAs or merging LoRAs with base models.
  • Integrates Pivotal Tuning Inversion for improved results.
  • Supports safetensor format and xformers for performance.

Maintenance & Community

The README documents frequent updates from the project's active development period, and LoRA support has since been integrated into the Hugging Face diffusers library. Community discussion, tips, and contributions were encouraged via pull requests.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, its integration with Hugging Face diffusers suggests compatibility with common open-source workflows. Users should verify licensing for commercial use.

Limitations & Caveats

The README lists thorough performance comparisons against full fine-tuning as future work. Some features, such as Kronecker product adaptation and time-aware fine-tuning, remain TODOs. Improved documentation and better usability for non-programmers are also noted as areas for development.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Rodrigo Nader (Cofounder of Langflow), and 1 more.

DiffSynth-Studio by modelscope (Top 0.4%, 12k stars): Open-source project for diffusion model exploration. Created 2 years ago; updated 4 days ago.

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 5 more.

ai-toolkit by ostris (Top 0.8%, 10k stars): Training toolkit for finetuning diffusion models. Created 2 years ago; updated 17 hours ago.