DALLE-mtf by EleutherAI

DALL-E implementation for large-scale training

Created 4 years ago
434 stars

Top 68.5% on SourcePulse

Project Summary

This repository provides an implementation of OpenAI's DALL-E model in the Mesh-TensorFlow framework, targeting large-scale training. It aims to let researchers and practitioners train models comparable to or larger than the original 12-billion-parameter DALL-E, with a focus on efficient distributed training.

How It Works

The project leverages Mesh-TensorFlow for distributed training, allowing for efficient scaling across multiple accelerators. It follows the DALL-E architecture, which involves a VAE (Variational Autoencoder) to compress images into discrete tokens and a transformer model to generate these tokens conditioned on text. The Mesh-TensorFlow framework facilitates the partitioning of model and data across a TPU mesh, enabling training of very large models.
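The tokenization stage described above can be sketched with a toy nearest-neighbour quantization, the core operation a discrete VAE's encoder output goes through. This is an illustration only, not the repository's actual Mesh-TensorFlow VAE; the codebook and patch sizes are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 8 codebook vectors and 6 image patches, each 4-dimensional.
codebook = rng.normal(size=(8, 4))   # learned code embeddings (here: random)
patches = rng.normal(size=(6, 4))    # encoder outputs for 6 image patches

# Nearest-neighbour quantization: each patch is replaced by the index of
# its closest codebook vector, yielding a discrete token sequence that a
# transformer can then model autoregressively, conditioned on text.
dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
tokens = dists.argmin(axis=1)        # shape (6,): one token id per patch

print(tokens.shape)
```

In the full pipeline, a decoder maps the quantized codes back to pixels, and the transformer only ever sees the token ids, which is what makes training on image-text pairs tractable at scale.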

Quick Start & Requirements

  • Install: Clone the repository and install dependencies via pip3 install -r requirements.txt.
  • Prerequisites: Requires Google Cloud Platform account, a storage bucket, and TPUs. Untested on GPUs but theoretically supported.
  • Setup: Use ctpu up --vm-only to create a VM connected to your GCP resources.
  • Documentation: Configuration details for VAE and DALL-E training are provided in the README.
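The steps above can be collected into a short setup sequence. This is a sketch: the repository URL is inferred from the project name, and the `ctpu` flags beyond `--vm-only` (zone, TPU name) depend on your own GCP configuration.

```shell
# Clone and install dependencies (repository URL assumed from the project name).
git clone https://github.com/EleutherAI/DALLE-mtf.git
cd DALLE-mtf
pip3 install -r requirements.txt

# Create a VM connected to your GCP resources; run before training on TPUs.
ctpu up --vm-only
```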

Highlighted Details

  • Implements DALL-E architecture in Mesh-TensorFlow for large-scale training.
  • Includes a VAE pretraining pipeline for image tokenization.
  • Supports custom dataset formatting with JSONL files and image directories.
  • Configuration examples for VAE and DALL-E models are detailed.
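The JSONL dataset format mentioned above can be sketched as follows. The exact field names the repository expects are defined in its README and are not reproduced here, so `image_path` and `caption` are hypothetical placeholders.

```python
import json

# Hypothetical JSONL records pairing images with captions; the field names
# "image_path" and "caption" are illustrative placeholders, not the schema
# DALLE-mtf necessarily expects.
records = [
    {"image_path": "images/0001.jpg", "caption": "a cat sitting on a mat"},
    {"image_path": "images/0002.jpg", "caption": "a red bicycle by a wall"},
]

# JSONL: one JSON object per line.
with open("dataset.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Reading it back is symmetric: parse each line independently.
with open("dataset.jsonl") as f:
    loaded = [json.loads(line) for line in f]

print(len(loaded))
```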

Maintenance & Community

This project is from EleutherAI, a research collective focused on open-source AI. Specific contributor details beyond Ben Wang and Aran Komatsuzaki are not explicitly listed.

Licensing & Compatibility

The repository's license is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking would require clarification of the license.

Limitations & Caveats

The project is marked as "[WIP]" (Work In Progress). No pre-trained models are available yet. Training is primarily designed for TPUs, with GPU support being theoretical. A public, large-scale dataset for DALL-E training is still in development.

Health Check

  • Last Commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

diffusion by mosaicml

  • Diffusion model training code
  • 707 stars · 0%; created 2 years ago; updated 8 months ago
  • Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Hanlin Tang (CTO Neural Networks at Databricks; Cofounder of MosaicML), and 1 more.

ai-toolkit by ostris

  • Training toolkit for finetuning diffusion models
  • 6k stars · 0.9%; created 2 years ago; updated 14 hours ago
  • Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher (Cofounder of Cloudera), and 5 more.