Reference library for Stable Diffusion 3.5 inference
This repository provides a reference implementation for Stable Diffusion 3.5 (SD3.5) and SD3, enabling simple inference. It targets developers and researchers needing to integrate SD3.5 capabilities into their applications, offering a foundational code library for text encoders, VAE decoder, and the novel MM-DiT architecture.
How It Works
The implementation is built around MM-DiT (Multi-Modal Diffusion Transformer), a new architecture that departs from previous diffusion models. Prompts are encoded by three public text encoders: OpenAI CLIP-L/14, OpenCLIP bigG, and Google T5-XXL. A 16-channel VAE decoder is also included; it is similar to the one in prior SD models but has no post-quantization convolution step. Together these components aim for efficient, high-quality image generation.
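To make the "16-channel" figure concrete, the sketch below computes the latent tensor shape the VAE decoder would receive for a given output resolution. The channel count comes from the description above; the 8x spatial downsampling factor and the helper name `latent_shape` are assumptions for illustration and are not taken from this repository's code.

```python
# Minimal sketch, not the repository's code.
LATENT_CHANNELS = 16  # from the description: 16-channel VAE
DOWNSAMPLE = 8        # ASSUMPTION: each latent pixel covers an 8x8 image patch


def latent_shape(height, width, batch=1):
    """Return (batch, channels, latent_h, latent_w) for an image size.

    Hypothetical helper: raises if the size is not a multiple of the
    assumed downsampling factor.
    """
    if height % DOWNSAMPLE or width % DOWNSAMPLE:
        raise ValueError(f"height and width must be multiples of {DOWNSAMPLE}")
    return (batch, LATENT_CHANNELS, height // DOWNSAMPLE, width // DOWNSAMPLE)


if __name__ == "__main__":
    print(latent_shape(1024, 1024))
```

Under that assumption, a 1024x1024 image corresponds to a 16x128x128 latent per sample.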
Quick Start & Requirements
Create and activate a virtual environment (python3 -m venv .sd3.5, then source .sd3.5/bin/activate) and install dependencies (python3 -m pip install -r requirements.txt).
Model files go in the models directory. ControlNet weights are optional.
Run python3 sd3_infer.py with the desired prompt and model path. Example: python3 sd3_infer.py --prompt "cute wallpaper art of a cat" --model models/sd3.5_large.safetensors
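For scripted use, the example invocation can be assembled programmatically. The sketch below only builds and prints the command. It uses the two flags shown in the example (--prompt and --model) and nothing else; the helper name `build_command` is hypothetical and not part of the repository.

```python
import shlex


def build_command(prompt, model_path, python="python3"):
    """Build the argv list for the sd3_infer.py invocation shown above.

    Hypothetical helper; only the --prompt and --model flags from the
    example are used.
    """
    return [python, "sd3_infer.py", "--prompt", prompt, "--model", model_path]


if __name__ == "__main__":
    cmd = build_command(
        "cute wallpaper art of a cat",
        "models/sd3.5_large.safetensors",
    )
    # Print a shell-safe rendering; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell-quoting problems when the prompt contains spaces or quotes.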
Highlighted Details
A skip_layer_cfg option is available for SD3.5-Medium, for potentially improved structure.
Maintenance & Community
The code originates from Stability AI's internal research, public repositories, and contributions from Alex Goodwin and Vikram Voleti. Some code is adapted from ComfyUI's internal Stability implementation and HuggingFace.
Licensing & Compatibility
The code is licensed under the MIT License. Some code originating from HuggingFace is subject to the Apache 2.0 License. Both licenses are permissive and generally allow commercial use and integration into closed-source projects.
Limitations & Caveats
This repository is described as a "tiny reference implementation" and excludes model weights. While it supports various SD3.5 variants, it is primarily for inference and may not cover all advanced features or training capabilities.