PyTorch code for universal diffusion guidance
This repository provides a PyTorch implementation of Universal Guidance for Diffusion Models, enabling control over image generation using arbitrary modalities like human identity, segmentation maps, object locations, and style without retraining. It targets researchers and developers working with diffusion models who need flexible conditioning beyond text prompts.
How It Works
The core approach modifies the diffusion sampling process to incorporate guidance signals alongside text conditioning. Built on Stable Diffusion and OpenAI's ImageNet Diffusion Model, it offers two mechanisms: forward guidance, which corrects the noise prediction using the gradient of a guidance loss evaluated on the predicted clean image, and backward guidance, which directly optimizes that prediction before resuming denoising. This avoids task-specific retraining and yields a generalized framework for diverse conditioning inputs, as sketched below.
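As a rough illustration (not the repository's exact API), forward guidance can be sketched as a correction to the noise prediction: the predicted clean image is fed to a task loss, and the gradient of that loss with respect to the noisy sample is added back, scaled by a guidance weight. The names eps_model, guidance_loss, and guidance_scale below are illustrative assumptions.

import torch

@torch.enable_grad()  # guidance needs gradients even inside a no_grad sampling loop
def guided_eps(eps_model, x_t, t, alpha_bar_t, guidance_loss, guidance_scale):
    # One forward-guidance correction: differentiate a task loss on the
    # predicted clean image x0_hat with respect to the noisy sample x_t.
    x_t = x_t.detach().requires_grad_(True)
    eps = eps_model(x_t, t)  # text-conditioned noise prediction
    x0_hat = (x_t - (1 - alpha_bar_t).sqrt() * eps) / alpha_bar_t.sqrt()
    grad = torch.autograd.grad(guidance_loss(x0_hat), x_t)[0]
    return eps + guidance_scale * (1 - alpha_bar_t).sqrt() * grad

# Toy usage with a dummy noise model and an L2 loss toward a target image.
eps_model = lambda x, t: torch.zeros_like(x)
target = torch.zeros(1, 3, 64, 64)
x_t = torch.randn(1, 3, 64, 64)
eps_hat = guided_eps(eps_model, x_t, t=torch.tensor(500),
                     alpha_bar_t=torch.tensor(0.3),
                     guidance_loss=lambda x0: ((x0 - target) ** 2).mean(),
                     guidance_scale=2.0)

In the actual repository the guidance loss comes from an off-the-shelf network, such as a face-recognition model for identity, a segmentation network, an object detector, or a style loss.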
Quick Start & Requirements
conda env create -f environment.yaml
conda activate ldm
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
pip install GPUtil facenet-pytorch blobfile
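A quick sanity check (generic PyTorch, not a repository script) confirms the environment sees a CUDA device before downloading checkpoints:

import torch
print(torch.__version__, torch.cuda.is_available())  # expect a cu113 build and True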
Model checkpoints must be downloaded separately: the Stable Diffusion v1.4 checkpoint (sd-v1-4.ckpt) and OpenAI's ImageNet Diffusion Model.
Maintenance & Community
No specific information on maintainers, community channels, or roadmap is provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. The presence of Stable Diffusion and OpenAI models implies adherence to their respective licenses. Commercial use compatibility is not specified.
Limitations & Caveats
The repository requires specific model checkpoints (Stable Diffusion and OpenAI's ImageNet Diffusion Model) to be downloaded separately. The README does not detail performance benchmarks or specific hardware requirements beyond CUDA.
Last updated about 2 years ago; the repository is inactive.