fast-pixel-cnn by PrajitR

PixelCNN++ speedup via caching for real-time image generation

Created 9 years ago

482 stars

Top 63.6% on SourcePulse

2 Experts Love This Project

Aravind Srinivas

Cofounder of Perplexity

Deshraj Yadav

Cofounder of Mem0

Project Summary

This repository speeds up image generation for PixelCNN++ models, achieving up to a 183x speedup over naive implementations. It targets researchers and practitioners who work with autoregressive generative models and need faster inference.

How It Works

The core idea is to cache previously computed hidden states within the convolutional layers. Naive PixelCNN++ implementations recompute these states for every generated pixel, even though most of them have not changed since the previous step. This approach instead keeps a queue-like cache for each layer, sized in proportion to that layer's dilation factor, and reuses the cached states when computing the next pixel. For layers with strided convolutions (downsampling/upsampling), a cache_every parameter controls how often the cache is updated, which keeps generation efficient without sacrificing accuracy.
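The queue-per-layer idea can be illustrated with a minimal NumPy sketch. It is not the repository's TensorFlow code: it assumes a simplified 1D stack of causal, kernel-size-2 dilated layers, and the names make_layers, naive_last_output, and CachedStack are hypothetical. Each layer keeps a queue whose length equals its dilation, so one generation step touches each layer once instead of recomputing the whole history.

```python
from collections import deque

import numpy as np


def make_layers(dilations, channels, seed=0):
    """Random weights for a stack of causal, kernel-size-2 dilated layers (toy model)."""
    rng = np.random.default_rng(seed)
    return [
        (d,
         0.5 * rng.normal(size=(channels, channels)),   # weights for the input d steps ago
         0.5 * rng.normal(size=(channels, channels)))   # weights for the current input
        for d in dilations
    ]


def naive_last_output(layers, inputs):
    """Naive generation step: rerun every layer over the full history."""
    h = np.stack(inputs)                                # shape (steps, channels)
    steps, channels = h.shape
    for d, w_past, w_cur in layers:
        # past[t] = h[t - d], with zeros where t - d < 0 (causal zero padding).
        past = np.concatenate([np.zeros((d, channels)), h])[:steps]
        h = np.tanh(past @ w_past + h @ w_cur)
    return h[-1]


class CachedStack:
    """Cached generation: one queue per layer, with length equal to its dilation."""

    def __init__(self, layers, channels):
        self.layers = layers
        # Zeros play the role of the causal padding before the first step.
        self.caches = [deque([np.zeros(channels)] * d) for d, _, _ in layers]

    def step(self, x):
        h = x
        for (d, w_past, w_cur), cache in zip(self.layers, self.caches):
            past = cache.popleft()                      # this layer's input from d steps ago
            cache.append(h)                             # remember the current input for later
            h = np.tanh(past @ w_past + h @ w_cur)
        return h


# Compare both approaches step by step on random inputs.
channels = 4
layers = make_layers([1, 2, 4], channels)
cached = CachedStack(layers, channels)
rng = np.random.default_rng(1)
inputs = [rng.normal(size=channels) for _ in range(16)]

max_err = 0.0
for t, x in enumerate(inputs):
    fast = cached.step(x)
    slow = naive_last_output(layers, inputs[:t + 1])
    max_err = max(max_err, float(np.max(np.abs(fast - slow))))
print("max abs difference:", max_err)
```

In this toy setting the cached step does a constant amount of work per layer, while the naive step grows with the length of the history. Strided layers and the cache_every parameter are not modeled here.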

Quick Start & Requirements

Install TensorFlow 1.0, NumPy, and Matplotlib.
Download OpenAI's pre-trained PixelCNN++ model (params_cifar.ckpt).
Run: CUDA_VISIBLE_DEVICES=0 python generate.py --checkpoint=/path/to/params_cifar.ckpt --save_dir=/path/to/save/generated/images
Reported performance was measured on a Tesla K40 GPU.

Highlighted Details

Achieves up to 183x speedup for 32x32 image generation.
Enables real-time generation of multiple images.
Caching mechanism generalizes to dilated and strided convolutions.
Applicable to other convolutional autoregressive models like Video Pixel Networks.

Maintenance & Community

The project is authored by Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, and Mohammad Babaeizadeh. The associated paper is "Fast Generation for Convolutional Autoregressive Models" (arXiv:1704.06001), for which a citation is provided.

Licensing & Compatibility

The repository does not explicitly state a license. The underlying PixelCNN++ model is typically associated with permissive licenses, but this specific implementation's licensing is not detailed.

Limitations & Caveats

The code is tested with Python 3 and TensorFlow 1.0, and may require modifications for other versions. The performance claims are based on a Tesla K40 GPU, and results may vary on different hardware.

Health Check

Last Commit

8 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days