pytorch-stable-diffusion  by hkproj

PyTorch code for Stable Diffusion image generation

created 1 year ago
929 stars

Top 40.2% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of Stable Diffusion, a text-to-image generation model. It is designed for researchers and developers who want to understand and experiment with the core components of Stable Diffusion without relying on higher-level libraries. The project allows users to generate images from text prompts using pre-trained Stable Diffusion models.

How It Works

The implementation is built from scratch in PyTorch, offering a direct look at the model's architecture and forward pass. It loads pre-trained weights and tokenizer files, and the README mentions compatibility with Stable Diffusion v1.5 and several fine-tuned models. The core workflow is to load these components, then process a text prompt to generate a corresponding image.
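As a rough illustration of the "load these components" step, the sketch below reads the two tokenizer files using only the standard library. It is not the repository's own loader: the helper name `load_tokenizer_files` is hypothetical, and the file layouts (a JSON token-to-id map in vocab.json, one space-separated BPE merge per line in merges.txt with an optional `#version` header) are assumptions based on the usual CLIP tokenizer format.

```python
import json
from pathlib import Path


def load_tokenizer_files(data_dir):
    """Hypothetical helper: read vocab.json and merges.txt from data_dir."""
    data_dir = Path(data_dir)

    # Assumption: vocab.json is a JSON object mapping token strings to integer ids.
    with open(data_dir / "vocab.json", encoding="utf-8") as f:
        vocab = json.load(f)

    # Assumption: merges.txt holds one BPE merge per line ("left right"),
    # optionally preceded by a single "#version ..." header line.
    with open(data_dir / "merges.txt", encoding="utf-8") as f:
        lines = f.read().splitlines()
    if lines and lines[0].startswith("#version"):
        lines = lines[1:]

    merges = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            merges.append((parts[0], parts[1]))
    return vocab, merges
```

Calling `load_tokenizer_files("data")` would return the vocabulary dictionary and the ordered merge list, which is a quick way to confirm the downloaded files parse before running the model.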

Quick Start & Requirements

  • Install: Clone the repository and download necessary model weights (v1-5-pruned-emaonly.ckpt) and tokenizer files (vocab.json, merges.txt) into a data folder.
  • Prerequisites: Python and PyTorch. The README does not state specific versions; a standard PyTorch installation is presumably sufficient.
  • Resources: Requires downloading model checkpoints, which can be several gigabytes.
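Before running anything, it can help to confirm that the downloads landed in the right place. The following is a minimal sketch, assuming the three file names listed above sit directly inside a folder named data. The helper `missing_data_files` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the install step above.
REQUIRED_FILES = ("v1-5-pruned-emaonly.ckpt", "vocab.json", "merges.txt")


def missing_data_files(data_dir="data"):
    """Hypothetical helper: return the required files not found in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (data_dir / name).is_file()]
```

An empty list from `missing_data_files()` means all three files are present. Otherwise the returned names show what still needs to be downloaded.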

Highlighted Details

  • Implemented from scratch in PyTorch for educational and research purposes.
  • Supports Stable Diffusion v1.5 and various fine-tuned models (e.g., InkPunk Diffusion, Illustration Diffusion).
  • Provides direct access to model components for deeper understanding.

Maintenance & Community

The project acknowledges several other Stable Diffusion implementations, including those from CompVis, divamgupta, kjsman, and Hugging Face's diffusers library. The README lists no community channels and shows no signals of active maintenance.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

This implementation is a foundational PyTorch port and may lack the optimizations, features, or user-friendly abstractions found in more mature libraries like Hugging Face's diffusers. The README does not detail performance benchmarks or specific hardware requirements beyond general PyTorch compatibility.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 96 stars in the last 90 days