distill-sd by segmind

Diffusion model distillation for smaller, faster Stable Diffusion

Created 2 years ago
610 stars

Top 53.8% on SourcePulse

Project Summary

This repository provides knowledge-distilled, smaller versions of Stable Diffusion models, offering up to 50% reduction in size and faster inference. It's targeted at researchers and developers looking to optimize Stable Diffusion for resource-constrained environments or faster iteration cycles, enabling high-quality image generation with reduced computational overhead.

How It Works

The project implements knowledge distillation: a smaller "student" U-Net learns to mimic a larger "teacher" U-Net (specifically SG161222/Realistic_Vision_V4.0). The training loss combines three MSE terms: between the student's predicted noise and the actual added noise, between the student's and teacher's final outputs, and between their intermediate block outputs. This multi-level (output- and feature-level) distillation aims to preserve image quality while significantly reducing model size and inference time.
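A minimal PyTorch sketch of such a combined objective is given below. The weighting factors, feature pairing, and function name are illustrative assumptions, not the repository's actual training code.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_pred, teacher_pred, true_noise,
                          student_feats, teacher_feats,
                          w_task=1.0, w_out=1.0, w_feat=1.0):
        """Sketch of a multi-level distillation loss (weights are assumptions)."""
        # 1. Denoising objective: student's predicted noise vs. the noise actually added
        task_loss = F.mse_loss(student_pred, true_noise)

        # 2. Output-level distillation: student mimics the teacher's final prediction
        output_kd = F.mse_loss(student_pred, teacher_pred)

        # 3. Feature-level distillation: match intermediate U-Net block outputs
        feat_kd = sum(F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats))

        return w_task * task_loss + w_out * output_kd + w_feat * feat_kd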

Quick Start & Requirements

  • Install/Run: Use the diffusers library for inference; the README provides an example Python snippet, and a minimal sketch is shown after this list.
  • Prerequisites: PyTorch, diffusers library. GPU with CUDA recommended for inference. Training requires accelerate and potentially large datasets.
  • Links: Hugging Face repos for small-sd and tiny-sd
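The sketch below loads a distilled checkpoint through the standard diffusers pipeline; the "segmind/tiny-sd" repository id is an assumption taken from the Hugging Face links above and should be swapped for whichever checkpoint you actually use.

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a distilled checkpoint; "segmind/tiny-sd" is assumed from the links above
    pipe = StableDiffusionPipeline.from_pretrained(
        "segmind/tiny-sd", torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")  # CUDA GPU recommended for inference

    prompt = "a portrait photo of an astronaut riding a horse"
    image = pipe(prompt, num_inference_steps=25).images[0]
    image.save("astronaut.png")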

Highlighted Details

  • Achieves up to 100% faster inference (roughly 2× throughput) and up to 30% lower VRAM footprint compared to standard Stable Diffusion.
  • Offers "sd_small" (579M parameters) and "sd_tiny" (323M parameters) variants.
  • Training scripts for knowledge distillation, checkpoint fine-tuning, and LoRA training are included.
  • Pre-trained checkpoints for general use and fine-tuned on portrait images are available.
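To check which variant a downloaded checkpoint corresponds to, the U-Net's parameter count can be compared against the figures above. This sketch assumes the cited counts refer to the U-Net and that the checkpoint loads through the standard diffusers pipeline; the repo id is a placeholder.

    import torch
    from diffusers import StableDiffusionPipeline

    # Repo id is an assumption; point this at the checkpoint you actually downloaded
    pipe = StableDiffusionPipeline.from_pretrained(
        "segmind/tiny-sd", torch_dtype=torch.float16
    )

    unet_params = sum(p.numel() for p in pipe.unet.parameters())
    # Compare against the ~579M (sd_small) / ~323M (sd_tiny) figures cited above
    print(f"U-Net parameters: {unet_params / 1e6:.0f}M")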

Maintenance & Community

  • The project is associated with Segmind.
  • The work is based on published research (the BK-SDM paper, presented at an ICML workshop).

Licensing & Compatibility

  • The repository itself does not explicitly state a license. The underlying diffusers library is typically under the MIT license. Pre-trained models on Hugging Face may have their own licenses.

Limitations & Caveats

The distilled models are at an early stage and may not yet produce production-quality general-purpose outputs. They are best suited for fine-tuning or LoRA training on specific concepts or styles, and may struggle with composability and multi-concept generation. The README notes a potential issue with config.json when resuming from checkpoints, which requires replacing the file manually.
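Since the checkpoints are positioned as bases for concept- or style-specific adaptation, a typical workflow is to train a LoRA against the distilled U-Net and load it at inference time. The inference side is sketched below; the base repo id and LoRA path are hypothetical placeholders, not artifacts shipped with this project.

    import torch
    from diffusers import StableDiffusionPipeline

    # Base model and LoRA path are hypothetical placeholders
    pipe = StableDiffusionPipeline.from_pretrained(
        "segmind/tiny-sd", torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_lora_weights("path/to/your-concept-lora")  # LoRA trained on a single concept/style

    image = pipe("a photo in the trained style", num_inference_steps=25).images[0]
    image.save("concept.png")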

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Starred by Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 1 more.

Explore Similar Projects

awesome-knowledge-distillation by dkozlov

0.1% · 4k stars
Collection of knowledge distillation resources
Created 8 years ago
Updated 3 months ago
Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Lewis Tunstall (Research Engineer at Hugging Face), and 15 more.

torchtune by pytorch

0.2% · 5k stars
PyTorch library for LLM post-training and experimentation
Created 1 year ago
Updated 1 day ago