improved-diffusion by openai

Image diffusion codebase for research

Created 4 years ago
3,662 stars

Top 13.2% on SourcePulse

Project Summary

This repository provides the codebase for improved denoising diffusion probabilistic models, enabling researchers and practitioners to train diffusion models and sample high-quality images. It offers implementations of multiple diffusion objectives, noise schedules, and conditional generation, with pre-trained models available for the ImageNet, CIFAR-10, and LSUN datasets.

How It Works

The project implements diffusion models that learn to reverse a gradual noising process. It supports different noise schedules (linear, cosine) and diffusion objectives (e.g., L_hybrid, L_vlb) to improve sample quality and training stability. The architecture uses U-Net-style networks with optional features such as learned sigmas, class conditioning, and attention mechanisms for enhanced performance.
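
As a concrete illustration of the cosine schedule from the Improved DDPM paper, here is a minimal Python sketch (a conceptual reimplementation, not the repository's code; the 4,000-step count and the s = 0.008 offset follow the paper):

```python
import math
import numpy as np

def cosine_betas(num_steps, s=0.008, max_beta=0.999):
    # alpha_bar(t) follows a squared-cosine curve, as in Nichol & Dhariwal (2021).
    def alpha_bar(t):
        return math.cos((t + s) / (1 + s) * math.pi / 2) ** 2

    betas = []
    for i in range(num_steps):
        t1 = i / num_steps
        t2 = (i + 1) / num_steps
        # beta_t = 1 - alpha_bar(t) / alpha_bar(t-1), clipped for numerical stability.
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return np.array(betas)

betas = cosine_betas(4000)                 # 4,000-step schedule, as in the paper
alphas_cumprod = np.cumprod(1.0 - betas)   # cumulative signal retention per step
```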

Quick Start & Requirements

  • Install via pip install -e .
  • Requires Python and PyTorch.
  • Data preparation involves organizing images into directories; specific scripts are provided for ImageNet, LSUN bedrooms, and CIFAR-10.
  • Training and sampling scripts accept hyperparameter flags for model architecture, diffusion process, and training configurations.
  • Official checkpoints and detailed run flags for various configurations are available (a sampling sketch follows this list).
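
Below is a hedged Python sketch of loading a pre-trained checkpoint and drawing samples. The helper names (model_and_diffusion_defaults, create_model_and_diffusion, p_sample_loop) follow the repository's script_util and diffusion modules, while the flag values and checkpoint filename are illustrative placeholders that must match whichever checkpoint is actually used:

```python
import torch
from improved_diffusion.script_util import (
    model_and_diffusion_defaults,
    create_model_and_diffusion,
)

# Start from the repository's default flags and override the ones that matter
# for the chosen checkpoint (values below are illustrative, not prescriptive).
config = model_and_diffusion_defaults()
config.update(dict(image_size=64, num_channels=128, num_res_blocks=3,
                   learn_sigma=True, diffusion_steps=4000, noise_schedule="cosine"))
model, diffusion = create_model_and_diffusion(**config)

# Hypothetical filename; use the checkpoint downloaded from the README links.
model.load_state_dict(torch.load("imagenet64_checkpoint.pt", map_location="cpu"))
model.eval()

with torch.no_grad():
    # p_sample_loop runs the full reverse diffusion from pure noise to images.
    samples = diffusion.p_sample_loop(model, (4, 3, 64, 64), clip_denoised=True)
print(samples.shape)  # (4, 3, 64, 64), values roughly in [-1, 1]
```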

Highlighted Details

  • Supports class-conditional generation and upsampling models.
  • Offers implementations for both the L_hybrid and L_vlb diffusion objectives (sketched after this list).
  • Provides configurations for linear and cosine noise schedules.
  • Includes pre-trained checkpoints for ImageNet (64x64), CIFAR-10 (32x32), and LSUN (256x256).
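
For reference, the relationship between the two objectives, as described in the Improved DDPM paper (a conceptual sketch, not the repository's code): L_hybrid mixes the simple noise-prediction MSE with a down-weighted variational-bound term that trains the learned variances.

```python
LAMBDA_VLB = 0.001  # weight used for the hybrid objective in the paper

def simple_loss(eps, eps_pred):
    # Standard DDPM objective: predict the noise added at step t.
    return ((eps - eps_pred) ** 2).mean()

def hybrid_loss(eps, eps_pred, vlb_term):
    # vlb_term is the variational-bound term; per the paper, the mean prediction
    # is detached inside it so the VLB only updates the learned variances.
    return simple_loss(eps, eps_pred) + LAMBDA_VLB * vlb_term
```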

Maintenance & Community

This project is from OpenAI. The README does not describe community channels or an active maintenance status.

Licensing & Compatibility

The repository is released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Training these models is computationally intensive and requires significant GPU resources, often necessitating distributed training setups (e.g., using MPI). Batch sizes specified in the README are for single-GPU training, and users may need to adjust --batch_size or use --microbatch for memory-constrained environments.
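
As an illustration of what --microbatch amounts to, here is a hedged gradient-accumulation sketch (conceptual only; the repository's own training loop handles this internally, and diffusion_loss_fn is a hypothetical stand-in for the model's loss computation):

```python
import torch

def train_step(model, diffusion_loss_fn, optimizer, batch, microbatch=4):
    # Split one logical batch into small chunks so peak GPU memory stays bounded,
    # while the accumulated gradient matches a single full-batch update.
    optimizer.zero_grad()
    n = batch.shape[0]
    for start in range(0, n, microbatch):
        chunk = batch[start:start + microbatch]
        loss = diffusion_loss_fn(model, chunk)
        # Scale so the summed gradients equal one full-batch backward pass.
        (loss * chunk.shape[0] / n).backward()
    optimizer.step()
```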

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 29 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 1 more.

cycle-diffusion by ChenWu98

PyTorch code for a diffusion-model latent-space research paper

Created 2 years ago · Updated 1 year ago
640 stars