distrifuser by mit-han-lab

Code for the research paper on distributed parallel inference of high-resolution diffusion models

created 1 year ago
698 stars

Top 49.8% on sourcepulse

Project Summary

DistriFusion addresses the challenge of accelerating high-resolution diffusion model inference across multiple GPUs without compromising image quality. It is designed for researchers and practitioners working with large-scale generative models who need to reduce latency and improve throughput for tasks like text-to-image generation.

How It Works

DistriFusion is a training-free distributed inference method: it splits the image into patches and assigns each patch to a GPU. The first denoising step uses synchronous communication so that patches can interact through fresh activations. Later steps reuse the activations computed in the previous step and exchange them asynchronously, which hides communication overhead behind computation and enables significant speedups.
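The schedule described above can be illustrated with a toy simulation. This is a minimal sketch, not the project's implementation: the names `split_into_patches`, `toy_layer`, and `run` are hypothetical, "devices" are plain list entries rather than GPUs, and the layer is a made-up function that needs context from the whole activation. It shows only the communication pattern: step 0 uses fresh context from every patch, while later steps combine a device's own fresh patch with the other patches as they were one step earlier.

```python
from typing import List, Optional


def split_into_patches(activation: List[float], n_devices: int) -> List[List[float]]:
    """Split a flat activation into equal contiguous patches, one per device."""
    if n_devices < 1 or len(activation) % n_devices != 0:
        raise ValueError("activation length must be divisible by n_devices")
    size = len(activation) // n_devices
    return [activation[i * size:(i + 1) * size] for i in range(n_devices)]


def toy_layer(own_patch: List[float], context: List[float]) -> List[float]:
    """Hypothetical layer: mixes each value with the mean of the full context."""
    mean = sum(context) / len(context)
    return [0.5 * x + 0.5 * mean for x in own_patch]


def run(activation: List[float], n_devices: int, n_steps: int,
        asynchronous: bool = True) -> List[float]:
    """Simulate n_steps of patch-parallel processing and return the full result."""
    patches = split_into_patches(activation, n_devices)
    stale: Optional[List[List[float]]] = None  # layer inputs from the previous step
    for step in range(n_steps):
        new_patches = []
        for d in range(n_devices):
            if step == 0 or not asynchronous:
                # Synchronous: wait for fresh patches from every device.
                context = [x for p in patches for x in p]
            else:
                # Asynchronous: own patch is fresh, the others are one step old.
                context = [x for j in range(n_devices)
                           for x in (patches[j] if j == d else stale[j])]
            new_patches.append(toy_layer(patches[d], context))
        # In a real system these inputs would be sent in the background
        # while the device computes; here we simply keep them for the next step.
        stale = patches
        patches = new_patches
    return [x for p in patches for x in p]


if __name__ == "__main__":
    x = [0.0, 2.0, 4.0, 6.0]
    print("synchronous :", run(x, n_devices=2, n_steps=3, asynchronous=False))
    print("asynchronous:", run(x, n_devices=2, n_steps=3, asynchronous=True))
```

In this toy setting the two schedules agree exactly after the first step and drift slightly afterwards, since the asynchronous variant works with context that is one step old.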

Quick Start & Requirements

  • Installation: pip install distrifuser or pip install git+https://github.com/mit-han-lab/distrifuser.git
  • Prerequisites: Python 3, NVIDIA GPU with CUDA >= 12.0, CuDNN, PyTorch >= 2.2.
  • Usage: Run scripts with torchrun --nproc_per_node=$N_GPUS scripts/sdxl_example.py.
  • Resources: Requires multiple NVIDIA GPUs.
  • Links: Project, Paper, Blog, Slides, YouTube, Poster
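As a small convenience, the launch command from the usage bullet can be assembled programmatically. This is a sketch only: `torchrun_command` is a hypothetical helper, not part of distrifuser, and it assumes the command is run from a clone of the repository so that `scripts/sdxl_example.py` exists.

```python
import shlex
from typing import List


def torchrun_command(n_gpus: int, script: str = "scripts/sdxl_example.py") -> List[str]:
    """Build the torchrun argument list for launching one process per GPU."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return ["torchrun", f"--nproc_per_node={n_gpus}", script]


if __name__ == "__main__":
    # Print a shell-ready command line for 4 GPUs.
    print(shlex.join(torchrun_command(4)))
```

The returned list can also be passed directly to `subprocess.run` on a machine that has the required GPUs.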

Highlighted Details

  • Achieves 1.8x, 3.4x, and 6.1x speedups on 2, 4, and 8 A100 GPUs respectively for 3840x3840 SDXL generation.
  • Preserves visual fidelity and quality, validated by FID scores.
  • Offers APIs compatible with Hugging Face's diffusers library.
  • Integrated into NVIDIA TensorRT-LLM and supported by ColossalAI.
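To put the reported speedups in context, it can help to convert them to parallel efficiency (speedup divided by GPU count). The sketch below uses only the figures listed above; the function name is illustrative. It works out to roughly 90%, 85%, and 76% for 2, 4, and 8 GPUs.

```python
def parallel_efficiency(speedup: float, n_gpus: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return speedup / n_gpus


# Speedups reported for 3840x3840 SDXL generation on A100 GPUs.
reported = {2: 1.8, 4: 3.4, 8: 6.1}

if __name__ == "__main__":
    for n_gpus, speedup in reported.items():
        eff = parallel_efficiency(speedup, n_gpus)
        print(f"{n_gpus} GPUs: {speedup}x speedup, {eff:.0%} efficiency")
```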

Maintenance & Community

  • The project is associated with MIT and NVIDIA researchers.
  • Code is publicly available; the most recent commit was 8 months ago.
  • Supports SDXL, SD1.4, and SD2 models.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.
  • Code is based on Hugging Face diffusers and lmxyy/sige.

Limitations & Caveats

  • Requires multiple high-end NVIDIA GPUs and specific CUDA/PyTorch versions, limiting accessibility.
  • The README does not specify licensing, which could impact commercial use.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 22 stars in the last 90 days
