svdiff-pytorch by mkshing

PyTorch implementation for diffusion fine-tuning via compact parameter space

Created 2 years ago

383 stars

Top 74.7% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of SVDiff, a method that fine-tunes diffusion models in a compact parameter space. It supports single-subject generation and single-image editing, with smaller checkpoints and faster training than LoRA.

How It Works

SVDiff fine-tunes diffusion models by learning spectral shifts: small offsets added to the singular values of the pre-trained weight matrices in the U-Net and text encoder. Only these shifts are trained and saved, so a learned concept is stored compactly, giving smaller checkpoint files (1.2MB vs. 3.1MB for LoRA) and fewer trainable parameters. The method is designed to achieve comparable or better results with fewer training steps.
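
The spectral-shift idea can be sketched in a few lines of PyTorch. This is a minimal illustration, not the repository's code: it assumes the formulation from the SVDiff paper, in which the shifted singular values are passed through a ReLU before the weight is rebuilt. The function name spectral_shift_weight and the toy tensor shapes are hypothetical.

```python
import torch


def spectral_shift_weight(weight: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    """Rebuild a weight tensor from its SVD with shifted singular values.

    Hypothetical helper for illustration only; not the svdiff-pytorch API.
    """
    # Flatten conv-style weights (out, in, kh, kw) into a 2-D matrix.
    w2d = weight.reshape(weight.shape[0], -1)
    # Reduced SVD: w2d = u @ diag(s) @ vh
    u, s, vh = torch.linalg.svd(w2d, full_matrices=False)
    # Assumed formulation: ReLU keeps the shifted singular values non-negative.
    shifted = torch.relu(s + delta)
    return (u @ torch.diag(shifted) @ vh).reshape(weight.shape)


torch.manual_seed(0)
weight = torch.randn(8, 4)                  # frozen pre-trained weight (toy size)
delta = torch.zeros(4, requires_grad=True)  # the only trainable tensor

rebuilt = spectral_shift_weight(weight, delta)
rebuilt.sum().backward()                    # gradients reach delta only
```

With delta initialised to zero the rebuilt weight matches the original up to floating-point error, so fine-tuning starts from the pre-trained model. A practical implementation would compute the SVD once and cache u, s, and vh rather than repeating it on every forward pass.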

Quick Start & Requirements

Install via pip: pip install svdiff-pytorch
Alternatively, clone the repo and install requirements: git clone https://github.com/mkshing/svdiff-pytorch && cd svdiff-pytorch && pip install -r requirements.txt
Requires PyTorch and Hugging Face diffusers.
GPU with CUDA is recommended for training and inference.
Official documentation and a Gradio UI demo are available.

Highlighted Details

Uses about 0.5M fewer trainable parameters than LoRA and produces a 1.2MB checkpoint file (LoRA: 3.1MB).
Supports single-subject generation and single-image editing without DDIM inversion.
Offers optional integration with ToMe for faster prior generation during training.
Includes a Gradio UI for both training and inference.
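
To see why training only singular-value shifts needs fewer parameters than LoRA, the per-layer counts can be compared directly. The sketch below is a rough illustration: the layer shapes and the LoRA rank are made-up examples, and the totals are not meant to reproduce the 0.5M figure above.

```python
def spectral_shift_params(out_features: int, in_features: int) -> int:
    # One shift per singular value of an (out x in) matrix.
    return min(out_features, in_features)


def lora_params(out_features: int, in_features: int, rank: int) -> int:
    # LoRA trains two factors: (out x rank) and (rank x in).
    return rank * (out_features + in_features)


# Hypothetical layer shapes, chosen only for illustration.
layers = [(320, 320), (640, 768), (1280, 1280)]
rank = 4  # assumed LoRA rank

svdiff_total = sum(spectral_shift_params(o, i) for o, i in layers)
lora_total = sum(lora_params(o, i, rank) for o, i in layers)

print(f"spectral shifts: {svdiff_total} parameters")  # 2240
print(f"LoRA (rank {rank}): {lora_total} parameters")  # 18432
```

The gap grows with the LoRA rank, since the spectral-shift count depends only on the smaller dimension of each weight matrix.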

Maintenance & Community

The project was last updated in April 2023.
Key dependencies include Hugging Face diffusers.
Links to the original paper and related works (LoRA, ToMe) are provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README.
Compatibility with commercial or closed-source projects is not specified.

Limitations & Caveats

The project appears to be a snapshot from April 2023, and future maintenance or updates are not guaranteed.
Features like "Support multiple spectral shifts" and "SVDiff + LoRA" are marked as TODO.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 1 star in the last 30 days
