image-gpt by teddykoker

PyTorch implementation of the Image GPT research paper

Created 5 years ago
258 stars

Top 98.2% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of OpenAI's Image GPT (iGPT), a generative model that treats images as sequences of pixels. It aims to reproduce the results from the "Generative Pretraining from Pixels" paper, enabling users to train and sample from image generation models.

How It Works

The core approach quantizes images into discrete tokens using k-means clustering: each pixel is replaced by the index of its nearest cluster centroid. A GPT-like transformer then models the resulting pixel sequence autoregressively, predicting each token from the tokens before it. This supports generative pre-training followed by fine-tuning for tasks such as classification. The appeal is that a transformer architecture already proven on sequential data can be applied to images with little modification.
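The quantization step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the repository's code: the function names (`fit_centroids`, `quantize`, `dequantize`), the tiny random "images", and the choice of 4 clusters are all hypothetical.

```python
import numpy as np

def fit_centroids(pixels, k, iters=10, seed=0):
    """Plain k-means over pixel values; `pixels` has shape (N, channels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct pixels chosen at random.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(np.float64)
    for _ in range(iters):
        # Squared distance from every pixel to every centroid: shape (N, k).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(0)
    return centroids

def quantize(image, centroids):
    """Map an (H, W, C) image to a flat sequence of H*W centroid indices."""
    flat = image.reshape(-1, image.shape[-1]).astype(np.float64)
    dists = ((flat[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return dists.argmin(1)

def dequantize(tokens, centroids, shape):
    """Map token indices back to (approximate) pixel values."""
    return centroids[tokens].reshape(shape)

# Hypothetical data: 8 random grayscale 4x4 images with values in [0, 1).
rng = np.random.default_rng(0)
images = rng.random((8, 4, 4, 1))
centroids = fit_centroids(images.reshape(-1, 1), k=4)
tokens = quantize(images[0], centroids)          # sequence of 16 token ids
recon = dequantize(tokens, centroids, images[0].shape)
```

The token sequence is what a GPT-style model would be trained on; `dequantize` is only needed to turn sampled tokens back into a viewable image.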

Quick Start & Requirements

  • Install via pip.
  • Requires PyTorch.
  • A CUDA-capable GPU is recommended for reasonable training times (the README mentions an NVIDIA 2070 for Fashion-MNIST training).
  • Download pre-trained models using ./download.sh.
  • Official quick-start and usage examples are provided within the README.

Highlighted Details

  • Targets the iGPT-S architecture and training methodology; reproducing its results remains a stated goal.
  • Supports generative pre-training and classification fine-tuning.
  • Includes scripts for computing centroids, training, sampling, and generating GIFs.
  • Demonstrates training a small model (26K parameters) on Fashion-MNIST in under 2 hours on a single NVIDIA 2070.
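Sampling from an autoregressive pixel model, as the sampling scripts listed above do in some form, amounts to drawing one token at a time conditioned on the tokens generated so far. The sketch below illustrates only that loop. It is not the repository's script: `toy_logits` is a hypothetical stand-in for a trained transformer, and the sequence length, vocabulary size, and temperature are assumptions.

```python
import numpy as np

def toy_logits(prefix, vocab_size):
    """Hypothetical stand-in for a trained model: favors repeating the last token."""
    logits = np.zeros(vocab_size)
    if len(prefix) > 0:
        logits[prefix[-1]] = 2.0
    return logits

def sample_sequence(logits_fn, seq_len, vocab_size, temperature=1.0, seed=0):
    """Draw `seq_len` tokens one at a time, each conditioned on the prefix so far."""
    rng = np.random.default_rng(seed)
    tokens = []
    for _ in range(seq_len):
        logits = logits_fn(tokens, vocab_size) / temperature
        # Numerically stable softmax over the vocabulary.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        tokens.append(int(rng.choice(vocab_size, p=probs)))
    return np.array(tokens)

# Sample a 4x4 "image" of token ids from a vocabulary of 4 centroids.
tokens = sample_sequence(toy_logits, seq_len=16, vocab_size=4)
grid = tokens.reshape(4, 4)  # raster-order tokens laid back out as an image
```

With a real model, `logits_fn` would run the transformer on the prefix, and the sampled token grid would be mapped back to pixel values through the k-means centroids.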

Maintenance & Community

The project is marked as "WIP" (Work In Progress). No specific community channels or notable contributors are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is a work in progress with several planned features yet to be implemented, including batched k-means on GPU, BERT-style pre-training, and loading OpenAI's official pre-trained models. Reproducing iGPT-S results is a stated goal but may require significant compute resources.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
