guided-diffusion by openai

Image synthesis codebase for diffusion models

created 4 years ago
6,958 stars

Top 7.4% on sourcepulse

Project Summary

This repository provides the codebase for guided diffusion models, building upon openai/improved-diffusion with enhancements for classifier conditioning and architectural improvements. It enables users to generate high-fidelity images through diffusion processes, offering class-conditional and unconditional sampling, as well as super-resolution capabilities. The project is primarily aimed at researchers and practitioners in generative AI and computer vision.

How It Works

The project implements diffusion models, a class of generative models that learn to reverse a diffusion process that gradually adds noise to data. This codebase specifically focuses on classifier guidance, where a pre-trained classifier is used during the sampling process to steer the generation towards specific classes, improving sample quality and class adherence. It supports various architectures and noise schedules, including cosine and linear schedules, and incorporates techniques like attention mechanisms and scale-shift normalization for enhanced performance.
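
Concretely, classifier guidance amounts to nudging each reverse-diffusion step toward images the classifier assigns to the target class. The following is a minimal PyTorch sketch of that idea, not the repository's exact API: classifier_grad and guided_mean are hypothetical helper names, and the classifier is assumed to accept the noisy image and the timestep.

    # Minimal sketch of classifier guidance (hypothetical helper names, not
    # the repository's exact API). At each reverse step, the predicted mean
    # is shifted by the gradient of the classifier's log-probability for the
    # target class, scaled by the predicted variance and a guidance scale.
    import torch

    def classifier_grad(classifier, x, t, y):
        """Gradient of log p(y | x_t) with respect to the noisy image x_t."""
        with torch.enable_grad():
            x_in = x.detach().requires_grad_(True)
            logits = classifier(x_in, t)                  # classifier sees x_t and the timestep
            log_probs = torch.log_softmax(logits, dim=-1)
            selected = log_probs[range(len(y)), y].sum()  # sum of log p(y | x_t) over the batch
            return torch.autograd.grad(selected, x_in)[0]

    def guided_mean(mean, variance, grad, guidance_scale=1.0):
        """Shift the reverse-process mean toward higher classifier likelihood."""
        return mean + guidance_scale * variance * grad

Larger guidance scales generally trade sample diversity for higher fidelity and stronger class adherence.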

Quick Start & Requirements

  • Install: Clone the repository and install it as a Python package (e.g., pip install -e . from the repository root).
  • Prerequisites: Python 3.x, PyTorch, NumPy, Pillow, SciPy, and potentially CUDA for GPU acceleration. Pre-trained model checkpoints (e.g., 64x64_diffusion.pt, 256x256_classifier.pt) must be downloaded separately.
  • Setup: Requires downloading model checkpoints into a models/ directory. Sampling commands are provided for various resolutions and configurations; see the sketch after this list.
  • Links: Official Paper (Diffusion Models Beat GANs on Image Synthesis)
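
A hypothetical loading-and-sampling sketch, assuming the script_util helpers and the p_sample_loop entry point carried over from openai/improved-diffusion; the exact flag set for each checkpoint is listed in the repository README and must match the downloaded weights.

    # Hypothetical sketch: load a downloaded checkpoint and draw a few samples.
    # Helper names are assumed from the upstream improved-diffusion layout;
    # the real per-checkpoint flags are listed in the repository README.
    import torch
    from guided_diffusion.script_util import (
        model_and_diffusion_defaults,
        create_model_and_diffusion,
    )

    options = model_and_diffusion_defaults()
    options.update(image_size=64, class_cond=True)  # must match the checkpoint's training flags

    model, diffusion = create_model_and_diffusion(**options)
    model.load_state_dict(torch.load("models/64x64_diffusion.pt", map_location="cpu"))
    model.eval()

    # Draw a small batch of class-conditional 64x64 samples.
    samples = diffusion.p_sample_loop(
        model,
        (4, 3, 64, 64),
        model_kwargs={"y": torch.randint(0, 1000, (4,))},
    )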

Highlighted Details

  • Achieves state-of-the-art FID scores on ImageNet, outperforming GANs in image synthesis quality.
  • Supports class-conditional generation, enabling control over the generated image content.
  • Includes super-resolution models for upscaling lower-resolution generated images.
  • Provides example scripts for sampling from pre-trained models and training custom classifiers.

Maintenance & Community

This project is maintained by OpenAI. Further community interaction details are not explicitly provided in the README.

Licensing & Compatibility

The repository is released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The codebase is primarily focused on research and may require significant computational resources (GPU, memory) for training and sampling, especially at higher resolutions. The README does not detail specific version requirements for dependencies beyond Python 3.x.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History
215 stars in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Travis Fischer (Founder of Agentic), and 3 more.

consistency_models by openai

0.0%
6k
PyTorch code for consistency models research paper
created 2 years ago
updated 1 year ago
Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 2 more.

glide-text2im by openai

0.1%
4k
Text-conditional image synthesis model from research paper
created 3 years ago
updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 4 more.

taming-transformers by CompVis

0.1%
6k
Image synthesis research paper using transformers
created 4 years ago
updated 1 year ago