InstaFlow by gnobitab

One-step image generator using Rectified Flow (ICLR 2024)

Created 2 years ago

1,386 stars

Top 29.0% on SourcePulse

Project Summary

InstaFlow offers ultra-fast, one-step text-to-image generation, achieving quality comparable to Stable Diffusion while drastically reducing inference time. It is designed for researchers and users seeking efficient high-quality image synthesis, leveraging the Rectified Flow technique.

How It Works

InstaFlow uses Rectified Flow, which trains probability flows whose trajectories are straight lines. Because a straight trajectory can be followed exactly with a single integration step, the model maps noise directly to an image in one step and avoids the iterative sampling that traditional diffusion models require. The pipeline has three stages: generating (text, noise, image) triplets from pre-trained Stable Diffusion, applying text-conditioned reflow to straighten the generative flow, and distilling the straightened flow into a one-step model.
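The link between straight trajectories and one-step sampling can be illustrated with a toy sketch. It is not InstaFlow's code: the 1-D velocity fields, the constants `A`, `B`, `K`, and the helper `euler_sample` are all hypothetical and chosen only so the exact answers are known in closed form.

```python
import numpy as np

def euler_sample(velocity, z0, num_steps):
    """Integrate dz/dt = velocity(z, t) from t=0 to t=1 with fixed-step Euler."""
    z = np.asarray(z0, dtype=np.float64).copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        z = z + dt * velocity(z, t)
    return z

# Hypothetical straight flow: each path is the line
# z_t = (1 - t) * z0 + t * (A * z0 + B), so it ends at A * z0 + B.
A, B = 2.0, 1.0

def straight_velocity(z, t):
    z0 = (z - t * B) / ((1.0 - t) + t * A)  # recover the path's starting point
    return (A - 1.0) * z0 + B               # velocity is constant along each path

# Hypothetical curved flow for contrast: paths are exponentials z0 * exp(K * t).
K = 1.5

def curved_velocity(z, t):
    return K * z

rng = np.random.default_rng(0)
noise = rng.standard_normal(5)

straight_1 = euler_sample(straight_velocity, noise, num_steps=1)
straight_100 = euler_sample(straight_velocity, noise, num_steps=100)
curved_1 = euler_sample(curved_velocity, noise, num_steps=1)
curved_exact = noise * np.exp(K)

# One step matches 100 steps on the straight flow, but not on the curved one.
print("straight flow, 1 vs 100 steps:", np.max(np.abs(straight_1 - straight_100)))
print("curved flow, 1 step vs exact: ", np.max(np.abs(curved_1 - curved_exact)))
```

In this toy setting a single Euler step already lands on the endpoint of the straight flow, while the curved flow needs many small steps to get close. Reflow aims to move a learned model toward the first case.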

Quick Start & Requirements

Install/Run: Pre-trained models and inference code are available. A Colab notebook is provided for easy experimentation.
Prerequisites: A GPU is required (benchmarks were reported on an A100). Pre-trained LoRAs and ControlNets are supported, and ONNX support is available.
Resources: Inference time is approximately 0.1 seconds on an A100 GPU.
Links: Paper, Hugging Face Demo, Colab Notebook

Highlighted Details

Achieves ~90% inference time reduction compared to standard Stable Diffusion.
Generates images with FID scores comparable to state-of-the-art GANs like StyleGAN-T.
Compatible with pre-trained LoRAs and ControlNets.
Offers ONNX support for broader deployment.

Maintenance & Community

The project is associated with ICLR 2024 and has contributions from researchers at multiple institutions. Updates include extensions to text-to-3D and image editing, as well as new few-step models. The project relies heavily on the 🤗 Diffusers library.

Licensing & Compatibility

The project's README does not explicitly state a license. It does note that the training scripts are modified from Diffusers examples, which typically use Apache 2.0. Whether commercial use is permitted is not specified.

Limitations & Caveats

Training InstaFlow-0.9B required 199 A100 GPU days, so training a custom model is computationally expensive. One-step generation is the headline feature, but compatibility with a specific pre-trained model or fine-tuning technique should be verified before relying on it.
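To put the 199 GPU-day figure in perspective, the arithmetic below converts it to GPU hours and to wall-clock time on a few cluster sizes. Only the 199 figure comes from the text above. The cluster sizes are hypothetical, and the conversion assumes perfect linear scaling across GPUs, which real training runs rarely achieve.

```python
GPU_DAYS = 199  # A100 GPU days reported for InstaFlow-0.9B

def wall_clock_days(gpu_days, num_gpus):
    """Idealized wall-clock days, assuming perfect linear scaling."""
    return gpu_days / num_gpus

gpu_hours = GPU_DAYS * 24
print(f"{GPU_DAYS} GPU days = {gpu_hours} GPU hours")

# Hypothetical cluster sizes, for illustration only.
for num_gpus in (1, 8, 64):
    days = wall_clock_days(GPU_DAYS, num_gpus)
    print(f"{num_gpus:>2} GPUs: about {days:.1f} days")
```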

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

8 stars in the last 30 days