piFlow by Lakonik

Policy-based flow models for fast, high-quality image generation

Created 4 months ago

278 stars

Top 93.5% on SourcePulse

Project Summary

Summary

pi-Flow is a policy-based flow model framework for few-step image generation, aimed at researchers and practitioners in generative AI. Rather than predicting a denoised state directly, the network outputs a fast policy that guides several ODE substeps, producing high-quality, diverse images in few inference steps while staying faithful to the teacher model.

How It Works

The core innovation is pi-Flow's policy-based approach: at each step the network predicts a policy rather than a direct denoised state, and that policy then drives multiple ODE substeps. Training uses policy-based imitation distillation (pi-ID), a simplified method that relies only on an L2 loss against a teacher model and avoids more complex machinery such as Jacobian-vector products (JVPs) or GANs. According to the README, this design balances quality and diversity, excels at fine-grained texture generation, and scales to large text-to-image models.
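The idea can be illustrated with a toy, scalar-valued sketch. Everything in it is an assumption made for illustration and not the repository's API: the names `make_toy_network`, `make_toy_teacher`, `sample`, and `imitation_loss` are hypothetical, the policy is modeled as a frozen denoised estimate, and time runs from 1 (noise) to 0 (data). It only shows the shape of the technique: one network call per step returns a cheap policy, the policy is integrated over several Euler substeps, and an L2 loss compares policy and teacher velocities along the policy's own rollout.

```python
from typing import Callable, Tuple

# A policy maps (state, time) to a flow velocity without calling the network.
Policy = Callable[[float, float], float]


def make_toy_network(x0_hat: float) -> Callable[[float, float], Policy]:
    """Hypothetical stand-in for the network: one call returns a policy."""
    def network(x_t: float, t: float) -> Policy:
        def policy(x: float, s: float) -> float:
            # Assumed convention x_s = (1 - s) * x0 + s * noise,
            # so the velocity toward a frozen estimate is (x - x0_hat) / s.
            return (x - x0_hat) / max(s, 1e-6)
        return policy
    return network


def make_toy_teacher(x0: float) -> Policy:
    """Hypothetical teacher velocity field with a known clean target x0."""
    return lambda x, s: (x - x0) / max(s, 1e-6)


def sample(network, x: float, num_steps: int = 4,
           substeps: int = 8) -> Tuple[float, int]:
    """Few-step sampling: one network call per step, many cheap substeps."""
    calls = 0
    times = [1.0 - i / num_steps for i in range(num_steps + 1)]
    for t_start, t_end in zip(times[:-1], times[1:]):
        policy = network(x, t_start)  # the only expensive call in this step
        calls += 1
        ds = (t_end - t_start) / substeps
        s = t_start
        for _ in range(substeps):
            x = x + ds * policy(x, s)  # Euler substep driven by the policy
            s += ds
    return x, calls


def imitation_loss(policy: Policy, teacher: Policy, x: float,
                   t_start: float, t_end: float, substeps: int = 8) -> float:
    """Mean squared velocity gap along the policy's own rollout (L2 loss)."""
    ds = (t_end - t_start) / substeps
    s, total = t_start, 0.0
    for _ in range(substeps):
        v_pi = policy(x, s)
        total += (v_pi - teacher(x, s)) ** 2
        x = x + ds * v_pi  # follow the policy, not the teacher
        s += ds
    return total / substeps
```

In this toy setting, `sample(make_toy_network(2.0), x=-1.3)` makes four network calls regardless of how many substeps are used, which is the source of the speedup the summary describes.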

Quick Start & Requirements

Installation requires cloning the repository and running pip install -e . --no-build-isolation inside a Python 3.10 conda environment. Prerequisites are PyTorch 2.6, Linux (Ubuntu 20+), and ninja. Accessing FLUX models requires running huggingface-cli login first. Official demos are available on HuggingFace Spaces for pi-Qwen, pi-FLUX, and pi-FLUX.2.
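Before installing, the prerequisites listed above can be checked with a small script. This is a minimal sketch that uses only the Python standard library; the function name `check_prerequisites` and the result keys are hypothetical, and the script is not part of the project. It assumes PyTorch is distributed under the package name `torch` and treats any `2.6.x` build as matching.

```python
import platform
import shutil
import sys
from importlib import metadata


def check_prerequisites() -> dict:
    """Report which of the stated prerequisites hold in this environment."""
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None  # PyTorch is not installed here

    return {
        "python_3_10": sys.version_info[:2] == (3, 10),
        "linux": platform.system() == "Linux",
        "ninja_on_path": shutil.which("ninja") is not None,
        "torch_2_6": torch_version is not None
        and torch_version.startswith("2.6."),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

The check is advisory only; a failed entry does not necessarily mean installation will fail, since the summary notes only that other setups are untested.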

Highlighted Details

Enables 4-step generation for Qwen-Image and FLUX models, with elastic inference for Qwen-Image and pi-FLUX.2.
Scales effectively from ImageNet DiT to 20-billion-parameter models like Qwen-Image.
Built on the high-performance LakonLab codebase, featuring optimized distributed training (DDP, FSDP, FSDP2), mixed precision, and weight tying.
Supports advanced flow solvers and flexible storage backends (local, S3, HuggingFace, HTTP/S).
Demonstrates superior performance in generating fine-grained texture details and mitigating the quality-diversity trade-off.

Maintenance & Community

The project is associated with authors from Stanford University and Adobe Research. No specific community channels (e.g., Discord, Slack) or roadmap details are provided in the README.

Licensing & Compatibility

The repository's license is not explicitly stated in the README, which presents a significant ambiguity for adoption, particularly for commercial use. Windows compatibility is untested.

Limitations & Caveats

The primary limitation is the unstated license, which is a barrier to commercial adoption. Windows is untested, so support there is not guaranteed. The pinned PyTorch 2.6 requirement may call for a dedicated environment.
