glid-3-xl-stable by Jack000

Stable diffusion training/fine-tuning codebase

created 2 years ago
294 stars

Top 90.9% on sourcepulse

Project Summary

This repository provides a Stable Diffusion implementation back-ported to OpenAI's guided diffusion codebase, enabling easier development and training of diffusion models. It targets researchers and developers interested in advanced image generation techniques like inpainting, outpainting, classifier guidance, and super-resolution.

How It Works

The project leverages the established architecture of OpenAI's guided diffusion, integrating Stable Diffusion's components. This approach allows for modularity and extensibility, facilitating experimentation with various sampling methods, guidance scales, and model configurations. The use of MPI for distributed training enables efficient scaling across multiple GPUs.
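As a concrete illustration of the MPI-based distributed training described above, OpenAI's guided diffusion codebase launches its training scripts through mpiexec, with one process per GPU. The script name and flags below follow guided-diffusion's conventions and are assumptions here; check this repository's README for the exact entry point and options.

```shell
# Hypothetical multi-GPU training launch in the guided-diffusion style.
# "scripts/image_train.py" and its flags are assumptions borrowed from
# OpenAI's guided-diffusion codebase, not confirmed names in this repo.
mpiexec -n 4 python scripts/image_train.py \
    --data_dir /path/to/images \
    --batch_size 4 \
    --lr 1e-5
```

Each of the 4 MPI ranks would bind to its own GPU, which is how the codebase scales training across devices without a separate launcher.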

Quick Start & Requirements

  • Install: Clone the repository, install Latent Diffusion (pip install latent-diffusion), then install this project in editable mode (pip install -e .). For distributed training, install MPI and mpi4py (sudo apt install libopenmpi-dev, then pip install mpi4py).
  • Model Files: Download diffusion.pt and kl.pt from Hugging Face or split a Stable Diffusion checkpoint using split.py.
  • Prerequisites: Python, PyTorch, MPI, mpi4py, PyQt5 (for GUI).
  • Resources: Generating images requires minimal VRAM. Upscaling requires ~11GB VRAM. Training requires significant VRAM (48GB+ for general training, 80GB+ for large upscale models).
  • Docs: Hugging Face Repo
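The install and model-file steps above can be condensed into one shell sequence. This is a sketch under assumptions: the repository URL, the Stable Diffusion checkpoint filename, and the split.py invocation are inferred, not confirmed, so adjust them to match the actual README.

```shell
# Sketch of the setup steps; repo URL and checkpoint filename are assumptions.
git clone https://github.com/Jack000/glid-3-xl-stable
cd glid-3-xl-stable
pip install latent-diffusion    # Latent Diffusion dependency
pip install -e .                # this project, editable mode

# MPI stack for distributed training (Debian/Ubuntu)
sudo apt install libopenmpi-dev
pip install mpi4py

# Either download diffusion.pt and kl.pt from the Hugging Face repo,
# or split an existing Stable Diffusion checkpoint into those two files:
python split.py sd-v1-4.ckpt    # checkpoint filename is hypothetical
```

The split step exists because this codebase keeps the diffusion UNet (diffusion.pt) and the KL autoencoder (kl.pt) as separate files, unlike a monolithic Stable Diffusion checkpoint.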

Highlighted Details

  • Supports inpainting, outpainting, classifier-guided generation, and super-resolution.
  • Offers a GUI for inpainting via PyQt5.
  • Includes scripts for training custom classifiers and fine-tuning models.
  • Allows merging checkpoints for compatibility with other Stable Diffusion tools.

Maintenance & Community

The repository appears to be a personal project by Jack000. No specific community channels or active development signals are present in the README.

Licensing & Compatibility

The README does not explicitly state a license. The project depends on Latent Diffusion, whose codebase is MIT-licensed; note, however, that Stable Diffusion model weights are distributed under the CreativeML Open RAIL-M license, which carries use-based restrictions. Using this project in commercial or closed-source settings would require verifying the licenses of both the code and the weights.

Limitations & Caveats

Training requires substantial VRAM (48GB+, per the scripts), putting it out of reach for users without high-end hardware. Because the project builds on the older OpenAI guided diffusion codebase, it may also lack recent optimizations found in more current Stable Diffusion implementations.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Travis Fischer (Founder of Agentic), and 3 more.

consistency_models by openai (6k stars)
PyTorch code for consistency models research paper
created 2 years ago · updated 1 year ago

Starred by Aravind Srinivas (Cofounder of Perplexity), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 3 more.

guided-diffusion by openai (7k stars)
Image synthesis codebase for diffusion models
created 4 years ago · updated 1 year ago

Starred by Dan Abramov (Core Contributor to React), Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), and 28 more.

stable-diffusion by CompVis (71k stars)
Latent text-to-image diffusion model
created 3 years ago · updated 1 year ago