maxdiffusion by AI-Hypercomputer

Jax diffusion models for training and inference

Created 2 years ago
261 stars

Top 97.5% on SourcePulse

Project Summary

MaxDiffusion is a collection of pure Python/Jax implementations of latent diffusion models, optimized for XLA devices such as Cloud TPUs and GPUs. It is intended as a research and production launching point for ambitious diffusion projects, enabling users to train, tune, and serve models such as Stable Diffusion 2.x and XL, Flux, and LTX-Video.

How It Works

MaxDiffusion leverages Jax and XLA for high-performance, distributed computing across TPU pods. Its architecture is designed for scalability and efficiency, supporting operations such as fused attention (via Transformer Engine on GPUs) and multi-host training. The project covers several diffusion model families and offers features such as LoRA loading and ControlNet inference, providing a flexible foundation for advanced diffusion tasks.
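To make the data-parallel pattern concrete, here is a minimal JAX sketch of a replicated training step with an all-reduced gradient, in the style this kind of multi-device training relies on. This is not maxdiffusion's actual code: the one-layer "denoiser", the parameter names, and the learning rate are placeholders.

    # Minimal data-parallel training step in JAX. The linear "denoiser" is a
    # stand-in for the project's UNet; only the pmap/pmean pattern is the point.
    import functools
    import jax
    import jax.numpy as jnp

    def model_apply(params, latents):
        # Placeholder denoiser: a single linear layer.
        return latents @ params["w"] + params["b"]

    @functools.partial(jax.pmap, axis_name="devices")
    def train_step(params, latents, targets):
        def loss_fn(p):
            pred = model_apply(p, latents)
            return jnp.mean((pred - targets) ** 2)
        loss, grads = jax.value_and_grad(loss_fn)(params)
        # All-reduce gradients so every device applies an identical update.
        grads = jax.lax.pmean(grads, axis_name="devices")
        params = jax.tree_util.tree_map(lambda p, g: p - 1e-4 * g, params, grads)
        return params, loss

    n, dim = jax.local_device_count(), 8
    params = {                                  # leading axis = device axis
        "w": jnp.stack([jnp.eye(dim)] * n),
        "b": jnp.stack([jnp.zeros((dim,))] * n),
    }
    latents = jnp.ones((n, 4, dim))             # per-device batch shard
    targets = jnp.zeros((n, 4, dim))
    params, loss = train_step(params, latents, targets)
    print(loss)                                 # one loss value per device

The same pattern scales from a single host to a TPU pod: XLA compiles the step once per device, and the pmean keeps replicas in sync.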

Quick Start & Requirements

  • Installation: clone the repository, then run pip install -r requirements.txt followed by pip install . For GPU fused attention, add pip install -U "jax[cuda12]" and pip install "transformer_engine[jax]". A post-install sanity check is sketched after this list.
  • Prerequisites: Ubuntu 22.04, Python 3.10, Tensorflow >= 2.12.0. GPU fused attention requires CUDA 12 and Transformer Engine; LTX-Video requires converting PyTorch weights.
  • Setup: specific instructions are provided for first-time users; multi-host development requires TPU setup via gcloud.
  • Docs: the README does not link to separate quick-start, docs, or demo pages, but its structure suggests usage examples are embedded directly in it.
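As mentioned above, a quick post-install sanity check (an assumption on our part, not a step from the README) is to confirm JAX actually sees the intended accelerators before launching a run:

    # Hypothetical post-install check: confirm JAX is importable and sees the
    # TPU or GPU backend. CpuDevice-only output means the accelerator backend
    # (e.g., jax[cuda12]) was not picked up.
    import jax
    print(jax.__version__)
    print(jax.devices())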

Highlighted Details

  • Supports training and inference for Stable Diffusion 2.x, XL, Flux (Dev, Schnell), and Lightning.
  • Includes Dreambooth training for Stable Diffusion 1.x and 2.x.
  • Features LTX-Video text-to-video and image-to-video generation.
  • Enables loading multiple LoRAs and Hyper-SDXL LoRA for inference.
  • Offers ControlNet inference for Stable Diffusion 1.4 and SDXL (a hedged Flax inference sketch follows this list).
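Since the project notes compatibility with Hugging Face Jax models, inference can be sketched with the diffusers Flax pipeline. This uses diffusers' public API rather than maxdiffusion's own entry points, which may differ, and the checkpoint name is illustrative:

    # Flax Stable Diffusion inference via Hugging Face diffusers; a sketch of
    # the compatible ecosystem, not maxdiffusion's own generation scripts.
    import jax
    import jax.numpy as jnp
    from flax.jax_utils import replicate
    from flax.training.common_utils import shard
    from diffusers import FlaxStableDiffusionPipeline

    pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", dtype=jnp.bfloat16
    )
    num_devices = jax.device_count()
    prompt_ids = pipeline.prepare_inputs(
        ["a watercolor painting of a lighthouse"] * num_devices
    )
    params = replicate(params)              # one copy of the weights per device
    rngs = jax.random.split(jax.random.PRNGKey(0), num_devices)
    prompt_ids = shard(prompt_ids)          # split the batch across devices
    images = pipeline(
        prompt_ids, params, rngs, num_inference_steps=25, jit=True
    ).images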

Maintenance & Community

The project is hosted on GitHub at AI-Hypercomputer/maxdiffusion. Community interaction details (Discord/Slack, etc.) are not specified in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility with Hugging Face Jax models is noted.

Limitations & Caveats

Flux finetuning and some LoRA formats have limited testing, and certain features are recommended or tested only on specific hardware (e.g., TPU v5p for Flux finetuning). The README does not discuss maintainer bus factor or provide a roadmap.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 15
  • Issues (30d): 1
  • Star History: 16 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Jeff Hammerbacher (cofounder of Cloudera), and 5 more.

ai-toolkit by ostris — 0.9% · 6k stars
Training toolkit for finetuning diffusion models. Created 2 years ago; updated 14 hours ago.
Starred by Yineng Zhang (inference lead at SGLang; research scientist at Together AI), Rodrigo Nader (cofounder of Langflow), and 1 more.

DiffSynth-Studio by modelscope — 0.9% · 10k stars
Open-source project for diffusion model exploration. Created 1 year ago; updated 15 hours ago.