minRF  by cloneofsimo

Minimal implementation of rectified flow transformers, based on SD3

created 1 year ago
602 stars

Top 55.1% on sourcepulse

Project Summary

This repository provides a minimal, self-contained implementation of Rectified Flow (RF) transformers, inspired by the SD3 approach and LLaMA-DiT architecture. It's designed for researchers and practitioners looking to understand and experiment with scalable RF models, offering simplified code for beginners and advanced features for more experienced users.

How It Works

The project implements Rectified Flow, a diffusion-model variant that learns a velocity field transporting samples between a noise distribution and the data distribution along near-straight paths; generation amounts to integrating the resulting ordinary differential equation (ODE). Timesteps are drawn with a logit-normal sampling strategy, which concentrates training on intermediate timesteps and is intended to improve training efficiency and scalability. The architecture draws on LLaMA-DiT. The code is structured to be easy to read and modify, with the model implementation kept separate from the training logic.
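As a rough illustration of the ideas above, the following standard-library Python sketch builds one rectified-flow training example and runs a plain Euler sampler. It is not the repository's code: the function names (sample_logit_normal, rf_training_pair, euler_sample) are hypothetical, and the time convention (t = 0 is data, t = 1 is noise) is an assumption made for this example.

```python
import math
import random


def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Draw t in (0, 1) by passing a Gaussian sample through a sigmoid."""
    u = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-u))


def rf_training_pair(x, rng):
    """Build one training example for a data vector x (a list of floats).

    Assumed convention: t = 0 is data, t = 1 is Gaussian noise.
    Returns (t, z_t, target), where z_t lies on the straight line between
    x and the noise sample, and target is the constant velocity along it.
    """
    t = sample_logit_normal(rng)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    z_t = [(1.0 - t) * xi + t * ni for xi, ni in zip(x, noise)]
    target = [ni - xi for xi, ni in zip(x, noise)]
    return t, z_t, target


def euler_sample(velocity_fn, z, steps=50):
    """Integrate dz/dt = v(z, t) from t = 1 (noise) down to t = 0 (data)."""
    dt = 1.0 / steps
    for i in range(steps, 0, -1):
        t = i * dt
        v = velocity_fn(z, t)
        z = [zi - dt * vi for zi, vi in zip(z, v)]
    return z
```

In a real training loop, a network would be fit to predict target from (z_t, t), typically with a mean-squared-error loss, and the trained network would then take the place of velocity_fn during sampling.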

Quick Start & Requirements

  • MNIST Training: pip install torch torchvision pillow, then python rf.py
  • CIFAR Training: python rf.py --cifar
  • ImageNet Training (Advanced): install hf_transfer (pip install hf_transfer), download the dataset with cd advanced && bash download.sh, then run bash run.sh.
  • Prerequisites: PyTorch, torchvision, and Pillow; hf_transfer is additionally needed for ImageNet.

Highlighted Details

  • Minimal, self-contained implementation for ease of understanding and modification.
  • Supports training on MNIST, CIFAR, and ImageNet datasets.
  • Advanced section includes a muP (Maximal Update Parametrization) grid search for ImageNet training, enabling zero-shot learning-rate (LR) transfer across model sizes.
  • Integrates techniques from min-max-IN-dit, min-max-gpt, and ez-muP.
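
To make the idea of zero-shot LR transfer more concrete, here is a simplified Python sketch of one muP rule: with Adam, the learning rate of hidden weight matrices is scaled inversely with width, so a value tuned on a narrow model can be reused on a wider one. The function name mup_hidden_lr and the widths are hypothetical, and this is an assumption-laden illustration rather than the repository's implementation. Full muP also adjusts initialization and treats input and output layers differently.

```python
def mup_hidden_lr(base_lr, base_width, width):
    """Scale an Adam LR for hidden weight matrices by base_width / width.

    Simplified muP rule for illustration only; input/output layers and
    initialization scaling are not covered here.
    """
    if base_width <= 0 or width <= 0:
        raise ValueError("widths must be positive")
    return base_lr * base_width / width


# Hypothetical usage: an LR tuned by grid search at width 256,
# carried over to a model with width 1024 without re-tuning.
tuned_lr = 1e-3
transferred_lr = mup_hidden_lr(tuned_lr, base_width=256, width=1024)
```

The point of the grid search is that it only needs to run on the small model; wider models reuse the result through the scaling rule.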

Maintenance & Community

The project is maintained by Simo Ryu. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing for commercial or closed-source use.

Limitations & Caveats

The "Massive Rectified Flow" section is marked for "gigachads" and requires downloading a custom ImageNet dataset, suggesting a higher barrier to entry for advanced features. The project is presented as a minimal implementation, implying potential missing features or optimizations found in more comprehensive RF libraries.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 89 stars in the last 90 days
