vdvae by openai

Research paper implementation for very deep VAE models

created 4 years ago
446 stars

Top 68.4% on sourcepulse

Project Summary

This repository provides the implementation of the paper "Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images." The paper argues that a sufficiently deep VAE can represent autoregressive models, and reports that VDVAE outperforms PixelCNN in log-likelihood on natural image benchmarks. The code is aimed at researchers and practitioners in deep learning and computer vision who want to explore likelihood-based generative modeling.

How It Works

VDVAE is a hierarchical variational autoencoder with many stochastic layers, scaled to greater stochastic depth than earlier VAEs. The depth of the latent hierarchy is what lets the model capture complex, high-dimensional image distributions. Compared with PixelCNN, the paper reports higher likelihoods, fewer parameters, much faster sampling, and easier application to high-resolution images.
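
To make the hierarchical VAE objective concrete, here is a minimal sketch of how a negative ELBO is assembled from a reconstruction term plus one KL term per stochastic layer. It is not taken from the repository: it assumes scalar Gaussian latents for readability, and the names gaussian_kl, total_kl, and negative_elbo are hypothetical.

```python
import math


def gaussian_kl(mu_q, log_sigma_q, mu_p, log_sigma_p):
    """KL(q || p) between two univariate Gaussians given means and log std-devs."""
    var_q = math.exp(2.0 * log_sigma_q)
    var_p = math.exp(2.0 * log_sigma_p)
    return (log_sigma_p - log_sigma_q
            + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p)
            - 0.5)


def total_kl(layers):
    """Sum the KL terms over all stochastic layers.

    Each entry is (mu_q, log_sigma_q, mu_p, log_sigma_p) for one layer:
    the approximate posterior q and the prior p at that level of the hierarchy.
    """
    return sum(gaussian_kl(*layer) for layer in layers)


def negative_elbo(reconstruction_nll, layers):
    """Negative ELBO = reconstruction negative log-likelihood + summed KL."""
    return reconstruction_nll + total_kl(layers)


# Two illustrative layers: one where posterior and prior agree (KL = 0),
# one where the posterior mean is shifted by 1 (KL = 0.5).
layers = [(0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)]
print(negative_elbo(2.0, layers))  # 2.5
```

In a real model each layer holds a tensor of latents and the prior parameters are computed from the layers above it, but the bookkeeping is the same: deeper hierarchies simply contribute more KL terms to the sum.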

Quick Start & Requirements

  • Install: Clone the repository and install dependencies including NVIDIA Apex.
  • Prerequisites: PyTorch 1.6, CUDA 10.1, Numpy 1.16, Ubuntu 18.04, V100 GPUs.
  • Data: Download datasets with the provided setup scripts (setup_cifar10.sh, setup_imagenet.sh, setup_ffhq256.sh, setup_ffhq1024.sh). The FFHQ dataset requires a manual download of the images_1024x1024 subfolder.
  • Training: Use mpiexec for distributed training (e.g., mpiexec -n 2 python train.py --hps cifar10).
  • Restoring Models: Download pre-trained checkpoints and use train.py with --restore_path and other restore arguments.
  • Links: Paper: https://arxiv.org/abs/2011.10650
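
The training and restore commands above can be assembled programmatically. The following is a small sketch, not part of the repository: build_train_command is a hypothetical helper, and the checkpoint path in the usage line is a placeholder. Only mpiexec -n, train.py, --hps, and --restore_path come from the summary above; any further restore arguments are passed through unchanged via extra_args rather than guessed.

```python
def build_train_command(num_processes, hps, restore_path=None, extra_args=()):
    """Build the argument list for a distributed train.py run.

    num_processes: number of MPI processes (the value given to mpiexec -n).
    hps:           hyperparameter preset name, e.g. "cifar10".
    restore_path:  optional checkpoint path passed as --restore_path.
    extra_args:    any additional arguments, appended verbatim.
    """
    cmd = ["mpiexec", "-n", str(num_processes),
           "python", "train.py", "--hps", hps]
    if restore_path is not None:
        cmd += ["--restore_path", restore_path]
    cmd += list(extra_args)
    return cmd


# Reproduces the example command from the list above.
print(" ".join(build_train_command(2, "cifar10")))
# mpiexec -n 2 python train.py --hps cifar10

# Restoring from a checkpoint (the path is a hypothetical placeholder).
print(" ".join(build_train_command(1, "cifar10", restore_path="checkpoints/model.th")))
```

The resulting list can be handed to subprocess.run from the repository root once the dependencies and datasets are in place.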

Highlighted Details

  • Reported to outperform PixelCNN in log-likelihood on CIFAR-10, ImageNet, and FFHQ.
  • Models range from 39M to 125M parameters.
  • Training on large datasets like ImageNet and FFHQ requires significant GPU resources (e.g., 32 V100s for 2.5 weeks).
  • Provides pre-trained checkpoints for CIFAR-10, ImageNet (32x32, 64x64), and FFHQ (256x256, 1024x1024).
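
To put the quoted training budget in perspective, the short calculation below converts "32 V100s for 2.5 weeks" into GPU-hours. It assumes continuous use of every GPU for the whole period, which is a simplification; gpu_hours is a hypothetical helper name.

```python
def gpu_hours(num_gpus, weeks):
    """Total GPU-hours, assuming every GPU runs around the clock."""
    hours_per_week = 7 * 24
    return num_gpus * weeks * hours_per_week


# 32 GPUs * 2.5 weeks * 168 hours/week
print(gpu_hours(32, 2.5))  # 13440.0
```

That is roughly 13,000 V100-hours for a single large run, which is why the pre-trained checkpoints are the practical starting point for most users.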

Maintenance & Community

  • Developed by OpenAI.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license. Check the repository for license terms before reusing the code.

Limitations & Caveats

  • Requires specific older versions of PyTorch and CUDA, and NVIDIA Apex, which may pose installation challenges.
  • Training from scratch is computationally intensive, requiring substantial GPU resources and time.
  • The FFHQ dataset setup requires manual data acquisition.
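
Because the listed prerequisites pin older releases (PyTorch 1.6, Numpy 1.16), it can help to compare installed versions against them before debugging anything else. The sketch below is an illustration only: major_minor and matches_pinned are hypothetical helpers, and comparing only the major.minor prefix is an assumption about how strict the match needs to be.

```python
# Versions taken from the prerequisites listed above.
PINNED = {"torch": "1.6", "numpy": "1.16"}


def major_minor(version):
    """Return (major, minor) from strings such as '1.6.0+cu101' or '1.16'."""
    core = version.split("+", 1)[0]  # drop local build tags like '+cu101'
    parts = core.split(".")
    return int(parts[0]), int(parts[1])


def matches_pinned(installed, pinned):
    """True when the installed version has the same major.minor as the pin."""
    return major_minor(installed) == major_minor(pinned)


print(matches_pinned("1.6.0+cu101", PINNED["torch"]))  # True
print(matches_pinned("1.13.1", PINNED["torch"]))       # False
print(matches_pinned("1.16.4", PINNED["numpy"]))       # True
```

In practice the installed strings would come from torch.__version__ and numpy.__version__ in the target environment.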

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 90 days
