One-step image generator using Rectified Flow (ICLR 2024)
Built on the Rectified Flow technique, InstaFlow offers ultra-fast, one-step text-to-image generation, achieving quality comparable to Stable Diffusion while drastically reducing inference time. It is aimed at researchers and users who need efficient, high-quality image synthesis.
How It Works
InstaFlow builds on Rectified Flow, which trains probability flows with straight trajectories. Because the trajectories are straight, noise can be mapped directly to images in a single step, bypassing the iterative sampling that traditional diffusion models require. The pipeline has three stages: generate (text, noise, image) triplets with a pre-trained Stable Diffusion model, apply text-conditioned reflow to straighten the generative flow, and distill the straightened flow into a one-step model.
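The key consequence of a straight trajectory is that a single Euler step spans the whole time interval. The sketch below illustrates this idea only; `velocity_model` is a hypothetical stand-in for the learned text-conditioned velocity field and is not part of the InstaFlow codebase.

```python
# Minimal sketch of one-step sampling from a (re)straightened flow.
# `velocity_model` is a hypothetical callable v(x_t, t, prompt_embeds);
# it is not the actual InstaFlow model interface.
import torch

@torch.no_grad()
def one_step_sample(velocity_model, prompt_embeds, latent_shape, device="cuda"):
    # Start from pure Gaussian noise at t = 0, the noise end of the trajectory.
    x0 = torch.randn(latent_shape, device=device)
    t = torch.zeros(latent_shape[0], device=device)
    # Because the reflowed trajectory is (approximately) straight, one Euler step
    # with step size 1 covers the whole [0, 1] interval: x1 = x0 + v(x0, 0).
    v = velocity_model(x0, t, prompt_embeds)
    x1 = x0 + v
    return x1  # image latent, to be decoded by the VAE as in Stable Diffusion
```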
Quick Start & Requirements
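The source does not list installation or usage steps. As a hedged sketch only: the project builds on PyTorch and 🤗 Diffusers, and one-step checkpoints are distributed in Diffusers format. The checkpoint id `XCLiu/instaflow_0_9B`, the pipeline class, and the `num_inference_steps=1` call pattern below are assumptions; consult the repository README for the exact loading code and any custom pipeline it requires.

```python
# Hedged Quick Start sketch -- the model id and call pattern are assumptions,
# not verified against the InstaFlow repository.
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint id; the repository may require its own
# rectified-flow pipeline instead of the stock StableDiffusionPipeline.
pipe = StableDiffusionPipeline.from_pretrained(
    "XCLiu/instaflow_0_9B", torch_dtype=torch.float16
).to("cuda")

prompt = "A photograph of a corgi wearing sunglasses on a beach"
# One-step generation: a single network evaluation instead of 25-50 diffusion steps.
image = pipe(prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
image.save("instaflow_sample.png")
```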
Highlighted Details
Maintenance & Community
The project is associated with ICLR 2024 and has contributions from researchers at multiple institutions. Updates include extensions to text-to-3D and image editing, as well as new few-step models. The project relies heavily on the 🤗 Diffusers library.
Licensing & Compatibility
The project's README does not explicitly state a license. However, it mentions that training scripts are modified from Diffusers examples, which typically use Apache 2.0. Compatibility for commercial use is not specified.
Limitations & Caveats
Training InstaFlow-0.9B required 199 A100 GPU days, so reproducing the model or training custom variants carries a significant computational cost. One-step generation is the headline feature, but compatibility with other pre-trained models or fine-tuning techniques is not documented and may require verification.