Local text-to-image diffusion using CLIP guidance
This repository provides a local implementation of CLIP-guided diffusion for text-to-image generation, so users can run generation on their own hardware instead of relying on cloud-based Colab notebooks. It targets researchers and hobbyists interested in AI art generation and offers a flexible way to experiment with diffusion models and CLIP for creative image synthesis.
How It Works
The project leverages OpenAI's guided diffusion models (available in 256x256 and 512x512 resolutions) and the CLIP model to connect text prompts with generated images. It allows weighted and multiple text prompts, as well as image prompts, to steer the diffusion process, and offers fine-grained control over generation parameters such as the guidance scale, the number of diffusion steps, and output smoothness, as sketched below.
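To illustrate the core mechanism, here is a minimal, simplified sketch of a single CLIP-guidance step: the gradient of a CLIP-similarity objective (minus a total-variation smoothness penalty) with respect to the current noisy image is fed back into the sampler. This is not the repository's actual code; the function names (`cond_fn`, `tv_loss`), the pre-encoded `text_features`, the `clip_model` argument, and the weighting values are assumptions for illustration, and real implementations also apply CLIP's pixel normalization and cutout augmentations, which are omitted here.

```python
import torch
import torch.nn.functional as F

def tv_loss(x):
    # Total-variation penalty: encourages spatially smooth images.
    return (x[..., :, 1:] - x[..., :, :-1]).abs().mean() + \
           (x[..., 1:, :] - x[..., :-1, :]).abs().mean()

def cond_fn(x, text_features, clip_model, clip_guidance_scale=1000, tv_scale=150):
    # Gradient of (CLIP similarity - smoothness penalty) w.r.t. the noisy image x;
    # the guided-diffusion sampler uses this gradient to shift each denoising step.
    with torch.enable_grad():
        x = x.detach().requires_grad_()
        # Resize to CLIP's expected input resolution and encode the image.
        clip_in = F.interpolate(x, size=(224, 224), mode="bilinear", align_corners=False)
        image_features = clip_model.encode_image(clip_in).float()
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
        # Cosine similarity between the image and the (pre-encoded, normalized) text prompt.
        sim = (image_features * text_features).sum(dim=-1).mean()
        loss = clip_guidance_scale * sim - tv_scale * tv_loss(x)
        return torch.autograd.grad(loss, x)[0]
```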
Quick Start & Requirements
conda create --name cgd python=3.9
conda activate cgd
git clone ...
cd CLIP-Guided-Diffusion
./setup.sh (or run the equivalent manual installation commands)
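As a quick sanity check after installation (this snippet is not part of the repository's setup script, just a suggested check), the following confirms that the environment has a CUDA-enabled PyTorch build:

```python
import torch

# Verify the freshly created environment can see the GPU.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```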
Highlighted Details
Key tunable parameters include clip_guidance_scale (how strongly CLIP steers generation toward the prompt), tv_scale (total-variation smoothing of the output), and diffusion_steps (number of denoising steps); a rough sketch of how a run might group these settings follows.
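The values below are illustrative assumptions, not the project's defaults, and the image_size entry is a hypothetical addition included only to show how the resolution choice pairs with the guided-diffusion checkpoint:

```python
# Hypothetical settings for a single run; tune per prompt and available GPU memory.
config = {
    "clip_guidance_scale": 1000,  # stronger values push the image harder toward the text prompt
    "tv_scale": 150,              # total-variation weight; higher values give smoother, less noisy images
    "diffusion_steps": 1000,      # more denoising steps are slower but often yield finer detail
    "image_size": 256,            # 256 or 512, matching the chosen guided-diffusion checkpoint
}
```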
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is described as "just playing," suggesting it may not be production-ready. Windows compatibility is untested. The setup requires specific older versions of PyTorch and CUDA, which might conflict with other environments.