pytorch-stable-diffusion by hkproj

PyTorch code for Stable Diffusion image generation

Created 2 years ago

1,016 stars

Top 36.8% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of Stable Diffusion, a text-to-image generation model. It is designed for researchers and developers who want to understand and experiment with the core components of Stable Diffusion without relying on higher-level libraries. The project allows users to generate images from text prompts using pre-trained Stable Diffusion models.

How It Works

The implementation is built from scratch in PyTorch, offering a direct look at the model's architecture and forward pass. It loads pre-trained weights and tokenizer files, and the README lists Stable Diffusion v1.5 and various fine-tuned models as compatible. At its core, the code loads these components and turns a text prompt into a corresponding image.
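To get a feel for what "loading pre-trained weights" involves, the following is a minimal sketch for inspecting a checkpoint's parameter names. It is not the repository's own loader. The function names `load_checkpoint_keys` and `count_top_level_prefixes` are hypothetical, and the assumption that the weights sit under a `state_dict` key (with prefixes such as `model.diffusion_model`, `first_stage_model`, and `cond_stage_model`) reflects the original CompVis checkpoint layout rather than anything stated in the README.

```python
from collections import Counter


def load_checkpoint_keys(path):
    """Return the parameter names stored in a checkpoint file (hypothetical helper)."""
    # Imported lazily so the counting helper below works without PyTorch installed.
    import torch

    # Depending on the PyTorch version, older pickled checkpoints may need
    # weights_only=False; only do that for files you trust.
    checkpoint = torch.load(path, map_location="cpu")
    # Assumption: weights live under "state_dict"; fall back to the top level otherwise.
    state_dict = checkpoint.get("state_dict", checkpoint)
    return list(state_dict.keys())


def count_top_level_prefixes(keys):
    """Count parameter names by their first dotted component."""
    return dict(Counter(key.split(".", 1)[0] for key in keys))
```

Calling `count_top_level_prefixes(load_checkpoint_keys("data/v1-5-pruned-emaonly.ckpt"))` would then give a rough picture of how the parameters are grouped, assuming the file is in a `data` folder as described in the Quick Start.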

Quick Start & Requirements

Install: Clone the repository, then download the model weights (v1-5-pruned-emaonly.ckpt) and the tokenizer files (vocab.json, merges.txt) into a data folder.
Prerequisites: Python and PyTorch. The README does not state specific versions; a standard PyTorch installation is implied to be sufficient.
Resources: The model checkpoints must be downloaded separately and can be several gigabytes in size.
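Before running anything, it can help to confirm that the downloads landed in the right place. The sketch below checks for the three file names listed above. The helper name `missing_files` is hypothetical, and the exact location of the data folder relative to the repository root is an assumption.

```python
from pathlib import Path

# File names as listed in the Quick Start; the folder location is an assumption.
EXPECTED_FILES = ("v1-5-pruned-emaonly.ckpt", "vocab.json", "merges.txt")


def missing_files(data_dir):
    """Return the expected file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]
```

For example, `missing_files("data")` returns an empty list once all three files are in place, and otherwise names the ones still to be downloaded.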

Highlighted Details

Implemented from scratch in PyTorch for educational and research purposes.
Supports Stable Diffusion v1.5 and various fine-tuned models (e.g., InkPunk Diffusion, Illustration Diffusion).
Provides direct access to model components for deeper understanding.

Maintenance & Community

The README acknowledges several other Stable Diffusion implementations: those from CompVis, divamgupta, and kjsman, and Hugging Face's diffusers library. It lists no community channels and shows no signals of active maintenance.

Licensing & Compatibility

The provided README does not state a license, so suitability for commercial use or closed-source linking is undetermined.

Limitations & Caveats

This is a foundational, from-scratch PyTorch implementation and may lack the optimizations, features, or user-friendly abstractions found in more mature libraries such as Hugging Face's diffusers. The README gives no performance benchmarks and no hardware requirements beyond general PyTorch compatibility.

