distrifuser by mit-han-lab

Research paper for distributed parallel inference of high-resolution diffusion models

Created 1 year ago
705 stars

Top 48.4% on SourcePulse

Project Summary

DistriFusion addresses the challenge of accelerating high-resolution diffusion model inference across multiple GPUs without compromising image quality. It is designed for researchers and practitioners working with large-scale generative models who need to reduce latency and improve throughput for tasks like text-to-image generation.

How It Works

DistriFusion is a training-free distributed inference method that splits the image into patches and assigns each patch to a GPU. In the initial step, the GPUs exchange patch activations synchronously. In later steps, each GPU reuses the activations it received during the previous step and communicates asynchronously. Because the exchange overlaps with computation, communication overhead is hidden inside the pipeline, which enables significant speedups.
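The schedule described above can be illustrated with a small, single-process toy. This is only a sketch under stated assumptions, not the project's implementation: the "image" is a 1-D list, the "layer" is a made-up averaging rule, and the function name `run_patch_parallel` and its `always_sync` flag are hypothetical. It shows one thing only: step 0 uses freshly gathered activations, while later steps combine a device's fresh local patch with remote activations that are one step old. It does not model the overlap of communication and computation.

```python
def run_patch_parallel(latent, n_devices, n_steps, always_sync=False):
    """Toy 1-D model of patch parallelism with one-step-stale remote activations.

    Hypothetical helper for illustration; not part of the distrifuser API.
    """
    if len(latent) % n_devices != 0:
        raise ValueError("latent length must be divisible by n_devices")
    size = len(latent) // n_devices
    # Each simulated device owns one contiguous patch.
    patches = [list(latent[i * size:(i + 1) * size]) for i in range(n_devices)]
    stale = None  # activations gathered during the previous step
    modes = []
    for step in range(n_steps):
        # What a blocking (synchronous) gather would return at this step.
        fresh = [x for p in patches for x in p]
        if step == 0 or always_sync:
            context, mode = fresh, "sync"
        else:
            context, mode = stale, "async"
        new_patches = []
        for dev, patch in enumerate(patches):
            # Local view: fresh values for the device's own patch,
            # context values (possibly stale) for everyone else's.
            view = list(context)
            view[dev * size:(dev + 1) * size] = patch
            mean = sum(view) / len(view)
            # Made-up "layer": move each local value halfway to the view mean.
            new_patches.append([0.5 * x + 0.5 * mean for x in patch])
        # Activations sent during this step become available for the next one.
        stale = fresh
        patches = new_patches
        modes.append(mode)
    return [x for p in patches for x in p], modes


if __name__ == "__main__":
    stale_out, modes = run_patch_parallel([0.0, 0.0, 4.0, 4.0], 2, 2)
    sync_out, _ = run_patch_parallel([0.0, 0.0, 4.0, 4.0], 2, 2, always_sync=True)
    print(modes, stale_out, sync_out)
```

In this toy, the stale and fully synchronous runs give slightly different numbers after the second step, which is the kind of gap the quality claims elsewhere on this page would need to account for.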

Quick Start & Requirements

  • Installation: pip install distrifuser or pip install git+https://github.com/mit-han-lab/distrifuser.git
  • Prerequisites: Python 3, NVIDIA GPU with CUDA >= 12.0, cuDNN, PyTorch = 2.2.
  • Usage: Run scripts with torchrun --nproc_per_node=$N_GPUS scripts/sdxl_example.py.
  • Resources: Requires multiple NVIDIA GPUs.
  • Links: Project, Paper, Blog, Slides, YouTube, Poster
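Before launching with torchrun, it can help to confirm that the environment matches the prerequisites listed above. The following is a minimal preflight sketch, not something shipped with the project. The helper names `parse_major_minor` and `check_prereqs` are hypothetical, and two readings are assumptions: "PyTorch = 2.2" is taken to mean any 2.2.x release, and "multiple GPUs" is taken to mean at least two visible devices.

```python
def parse_major_minor(version):
    """Return (major, minor) from strings such as '2.2.0+cu121' or '12.1'."""
    parts = (version.split("+")[0].split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])


def check_prereqs(cuda_version, torch_version, gpu_count):
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper. Assumes 'PyTorch = 2.2' means 2.2.x and that
    'multiple GPUs' means two or more.
    """
    problems = []
    if cuda_version is None or parse_major_minor(cuda_version) < (12, 0):
        problems.append("CUDA >= 12.0 is required")
    if parse_major_minor(torch_version) != (2, 2):
        problems.append("PyTorch 2.2 is required")
    if gpu_count < 2:
        problems.append("at least 2 GPUs are required")
    return problems


if __name__ == "__main__":
    try:
        import torch  # only needed for the live probe
    except ImportError:
        print("PyTorch is not installed")
    else:
        issues = check_prereqs(
            torch.version.cuda,         # None on CPU-only builds
            torch.__version__,
            torch.cuda.device_count(),  # 0 when no GPU is visible
        )
        print("OK" if not issues else "\n".join(issues))
```

The check does not cover cuDNN or the GPU model, so a passing result is a necessary condition rather than a guarantee that the example script will run.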

Highlighted Details

  • Achieves 1.8x, 3.4x, and 6.1x speedups on 2, 4, and 8 A100 GPUs respectively for 3840x3840 SDXL generation.
  • Preserves visual fidelity and quality, validated by FID scores.
  • Offers APIs compatible with Hugging Face's diffusers library.
  • Integrated into NVIDIA TensorRT-LLM and supported by ColossalAI.
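The speedup figures above can be turned into parallel efficiency (speedup divided by GPU count), which shows how much of the ideal linear scaling is retained. The short calculation below uses only the three numbers quoted in the list; the function name is illustrative.

```python
# Reported speedups for 3840x3840 SDXL generation, keyed by A100 count.
REPORTED_SPEEDUPS = {2: 1.8, 4: 3.4, 8: 6.1}


def parallel_efficiency(speedup, n_gpus):
    """Fraction of ideal linear scaling achieved (1.0 would be perfect)."""
    return speedup / n_gpus


if __name__ == "__main__":
    for n_gpus, speedup in REPORTED_SPEEDUPS.items():
        eff = parallel_efficiency(speedup, n_gpus)
        print(f"{n_gpus} GPUs: {speedup:.1f}x speedup, {eff:.0%} efficiency")
```

Efficiency works out to roughly 90%, 85%, and 76% on 2, 4, and 8 GPUs, so each added GPU contributes a little less than the one before it.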

Maintenance & Community

  • The project is associated with MIT and NVIDIA researchers.
  • Code is publicly available; the last commit was 9 months ago.
  • Supports SDXL, SD1.4, and SD2 models.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.
  • Code is based on Hugging Face diffusers and lmxyy/sige.

Limitations & Caveats

  • Requires multiple high-end NVIDIA GPUs and specific CUDA/PyTorch versions, limiting accessibility.
  • The README does not specify licensing, which could impact commercial use.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
