piFlow by Lakonik

Policy-based flow models for fast, high-quality image generation

Created 4 months ago
278 stars

Top 93.5% on SourcePulse

Project Summary

pi-Flow is a policy-based flow model framework for efficient, few-step image generation, aimed at researchers and practitioners in generative AI. Instead of predicting a denoised state directly, the network outputs a fast policy that guides the ODE substeps. This yields high-quality, diverse outputs in very few inference steps while staying faithful to the teacher model.

How It Works

The core innovation is pi-Flow's policy-based approach: the network predicts a policy rather than a direct denoised state, and that policy drives multiple ODE substeps during generation. Training uses policy-based imitation distillation (pi-ID), a simplified method that needs only an L2 loss against a teacher model and avoids more complex machinery such as Jacobian-vector products (JVPs) or GANs. According to the project, this design balances quality and diversity, excels at fine-grained texture generation, and scales to large text-to-image models.
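The idea can be illustrated with a toy sketch. This is not the repository's implementation: toy_network, policy_velocity, sample, and imitation_loss are hypothetical names, the "policy" is simply a fixed guess of the clean sample, and the linear path x_t = (1 - t) * x0 + t * noise is an assumption made for the example. It shows one network call per step, several network-free Euler substeps under the resulting policy, and an L2 loss between policy and teacher velocities.

```python
import numpy as np

def toy_network(x_t, t):
    """Hypothetical stand-in for the network: returns policy parameters.
    Here the parameters are just a constant guess of the clean sample x0."""
    return {"x0_hat": np.full_like(x_t, 2.0)}

def policy_velocity(params, x, t):
    """Network-free velocity, assuming the linear path x_t = (1 - t) * x0 + t * noise."""
    return (x - params["x0_hat"]) / t

def sample(x_start, num_steps=4, substeps=8):
    """Integrate from t=1 (noise) to t=0 (data) with one network call per step."""
    x = np.array(x_start, dtype=float)
    network_calls = 0
    times = np.linspace(1.0, 0.0, num_steps + 1)
    for t_start, t_end in zip(times[:-1], times[1:]):
        params = toy_network(x, t_start)  # the only network evaluation in this step
        network_calls += 1
        sub_times = np.linspace(t_start, t_end, substeps + 1)
        for s0, s1 in zip(sub_times[:-1], sub_times[1:]):
            # Euler substep driven by the policy, with no further network calls
            x = x + (s1 - s0) * policy_velocity(params, x, s0)
    return x, network_calls

def imitation_loss(params, teacher_velocity, x, t):
    """L2 loss between the policy velocity and a teacher velocity at (x, t)."""
    diff = policy_velocity(params, x, t) - teacher_velocity(x, t)
    return float(np.mean(diff ** 2))

if __name__ == "__main__":
    x_final, calls = sample(np.array([0.5, -1.0]))
    print("final sample:", x_final, "network calls:", calls)
```

In this toy setting the substeps cost only a few array operations, so adding more of them changes the integration accuracy without changing the number of network evaluations.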

Quick Start & Requirements

To install, clone the repository and run pip install -e . --no-build-isolation inside a Python 3.10 conda environment. Prerequisites are PyTorch 2.6, Linux (Ubuntu 20+), and ninja. Accessing the FLUX models requires running huggingface-cli login first. Official demos are available on HuggingFace Spaces for pi-Qwen, pi-FLUX, and pi-FLUX.2.
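Before installing, the prerequisites listed above can be checked with a short script. This is a convenience sketch rather than anything shipped with the project: check_environment is a hypothetical helper, and it only inspects the Python version, the operating system, the installed torch package version, and whether a ninja executable is on PATH.

```python
import importlib.metadata
import platform
import shutil
import sys

def check_environment():
    """Return a dict mapping each listed prerequisite to True/False."""
    try:
        torch_version = importlib.metadata.version("torch")
    except importlib.metadata.PackageNotFoundError:
        torch_version = None  # PyTorch is not installed in this environment
    return {
        "python_3_10": sys.version_info[:2] == (3, 10),
        "linux": platform.system() == "Linux",
        "torch_2_6": torch_version is not None and torch_version.startswith("2.6."),
        "ninja_on_path": shutil.which("ninja") is not None,
    }

if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or different'}")
```

The script reports mismatches instead of raising, so it can be run in any environment before creating the conda environment described above.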

Highlighted Details

  • Enables 4-step generation for Qwen-Image and FLUX models, with elastic inference for Qwen-Image and pi-FLUX.2.
  • Scales effectively from ImageNet DiT to 20-billion-parameter models like Qwen-Image.
  • Built on the high-performance LakonLab codebase, featuring optimized distributed training (DDP, FSDP, FSDP2), mixed precision, and weight tying.
  • Supports advanced flow solvers and flexible storage backends (local, S3, HuggingFace, HTTP/S).
  • Demonstrates superior performance in generating fine-grained texture details and mitigating the quality-diversity trade-off.

Maintenance & Community

The project is associated with authors from Stanford University and Adobe Research. No specific community channels (e.g., Discord, Slack) or roadmap details are provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README, which presents a significant ambiguity for adoption, particularly for commercial use. Windows compatibility is untested.

Limitations & Caveats

The unstated license is the main barrier to commercial adoption, and Windows support is not guaranteed because it has not been tested. The PyTorch 2.6 requirement may also call for a dedicated environment pinned to that version.

Health Check

Last commit: 2 weeks ago
Responsiveness: Inactive
Pull requests (30d): 0
Issues (30d): 1
Star history: 11 stars in the last 30 days
