distrifuser by mit-han-lab

Official implementation of a research paper on distributed parallel inference of high-resolution diffusion models

Created 1 year ago

716 stars

Top 48.0% on SourcePulse

3 Experts Love This Project

philschmid

DevRel at Google DeepMind

xiezhq-hermann

Coauthor of SGLang

merrymercy

Coauthor of SGLang, vLLM

Project Summary

DistriFusion addresses the challenge of accelerating high-resolution diffusion model inference across multiple GPUs without compromising image quality. It is designed for researchers and practitioners working with large-scale generative models who need to reduce latency and improve throughput for tasks like text-to-image generation.

How It Works

DistriFusion is a training-free distributed inference method: it splits the image into patches and assigns each patch to a GPU. The first denoising step uses synchronous communication so that patches can interact. Later steps communicate asynchronously, with each GPU reusing the slightly stale activations it received from the previous step as context for the regions held by other GPUs. Because the inputs of adjacent denoising steps are similar, this hides communication overhead behind computation and yields significant speedups.
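The synchronous-then-asynchronous schedule described above can be sketched with a toy simulation in plain Python. This is only an illustration under stated assumptions, not DistriFusion's implementation: the 1-D signal, the 3-tap `blur` stand-in for a network layer, and the names `run`, `halos_for`, and `warmup_steps` are hypothetical, and the real method reuses per-layer activations on GPUs rather than input edge values.

```python
def blur(values, left, right):
    """3-tap average over one patch; `left`/`right` are halo values from neighbours."""
    padded = [left] + values + [right]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def halos_for(k, edges, patches):
    """Halo pair for patch k, taken from (possibly stale) neighbour edge values."""
    # At the outer border there is no neighbour, so replicate the patch's own edge.
    left = edges[k - 1][1] if k > 0 else patches[k][0]
    right = edges[k + 1][0] if k < len(patches) - 1 else patches[k][-1]
    return left, right


def run(signal, n_devices, n_steps, warmup_steps=1):
    """Apply `n_steps` blur steps with the signal split across `n_devices` patches."""
    if len(signal) % n_devices != 0:
        raise ValueError("signal length must be divisible by n_devices")
    if warmup_steps < 1:
        raise ValueError("at least one synchronous warm-up step is required")
    size = len(signal) // n_devices
    patches = [signal[i * size:(i + 1) * size] for i in range(n_devices)]
    stale_edges = None
    for step in range(n_steps):
        # Edge values every device "sends" during this step: (first, last) of its patch.
        fresh_edges = [(p[0], p[-1]) for p in patches]
        # Warm-up steps wait for fresh edges (synchronous); later steps use the
        # edges that arrived during the previous step (asynchronous, one step stale).
        edges = fresh_edges if step < warmup_steps else stale_edges
        patches = [blur(p, *halos_for(k, edges, patches))
                   for k, p in enumerate(patches)]
        stale_edges = fresh_edges
    return [v for p in patches for v in p]


signal = [float(i % 5) for i in range(16)]
exact = run(signal, n_devices=4, n_steps=5, warmup_steps=5)   # every step synchronous
approx = run(signal, n_devices=4, n_steps=5, warmup_steps=1)  # sync first, then stale
drift = max(abs(a - b) for a, b in zip(exact, approx))
print(f"max deviation from the synchronous result: {drift:.4f}")
```

In this toy, setting `warmup_steps` equal to `n_steps` makes every step synchronous and reproduces a single-device run exactly. With fewer warm-up steps, devices stop waiting on each other and the output may deviate from that reference, which is the trade-off the asynchronous phase relies on.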

Quick Start & Requirements

Installation: pip install distrifuser or pip install git+https://github.com/mit-han-lab/distrifuser.git
Prerequisites: Python 3, NVIDIA GPU with CUDA >= 12.0 and the corresponding cuDNN, PyTorch >= 2.2.
Usage: Run scripts with torchrun --nproc_per_node=$N_GPUS scripts/sdxl_example.py.
Resources: Requires multiple NVIDIA GPUs.
Links: Project, Paper, Blog, Slides, YouTube, Poster
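For scripted launches, the `torchrun` invocation from the usage line above can be assembled programmatically. The following is a minimal sketch: the helper name `torchrun_command` is hypothetical, and it assumes the command is run from the repository root so that `scripts/sdxl_example.py` resolves.

```python
import shlex


def torchrun_command(n_gpus, script="scripts/sdxl_example.py"):
    """Build the argv list for a single-node torchrun launch with one process per GPU."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return ["torchrun", f"--nproc_per_node={n_gpus}", script]


cmd = torchrun_command(4)
# Print a shell-safe form of the command for inspection or logging.
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues.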

Highlighted Details

Achieves 1.8x, 3.4x, and 6.1x speedups on 2, 4, and 8 A100 GPUs respectively for 3840x3840 SDXL generation.
Preserves visual fidelity and quality, validated by FID scores.
Offers APIs compatible with Hugging Face's diffusers library.
Integrated into NVIDIA TensorRT-LLM and supported by ColossalAI.
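The reported speedups above can be turned into parallel efficiency (speedup divided by GPU count) to see how scaling holds up. This is a small worked calculation, assuming the speedups are measured against a single A100; the function name `parallel_efficiency` is illustrative.

```python
def parallel_efficiency(speedup, n_gpus):
    """Fraction of ideal linear scaling achieved: 1.0 means perfect scaling."""
    return speedup / n_gpus


# Speedups quoted above for 3840x3840 SDXL generation, keyed by GPU count.
reported = {2: 1.8, 4: 3.4, 8: 6.1}

for n_gpus, speedup in reported.items():
    eff = parallel_efficiency(speedup, n_gpus)
    print(f"{n_gpus} GPUs: {speedup}x speedup, {eff:.0%} efficiency")
```

Efficiency works out to roughly 90%, 85%, and 76% on 2, 4, and 8 GPUs, so scaling stays fairly close to linear but tapers as more GPUs are added.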

Maintenance & Community

The project is associated with MIT and NVIDIA researchers.
Code is publicly available; the most recent commit was about a year ago.
Supports SDXL, SD1.4, and SD2 models.

Licensing & Compatibility

The repository does not explicitly state a license in the README.
Code is based on Hugging Face diffusers and lmxyy/sige.

Limitations & Caveats

Requires multiple high-end NVIDIA GPUs and specific CUDA/PyTorch versions, limiting accessibility.
The README does not specify licensing, which could impact commercial use.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

4 stars in the last 30 days

Explore Similar Projects

fm-boosting by CompVis

Boosting latent diffusion models for high-resolution image synthesis

Created 2 years ago

Updated 2 months ago

MagCache by Zehong-Ma

Magnitude-aware caching for accelerated diffusion models

Created 7 months ago

Updated 1 month ago

DC-Gen by dc-ai-projects

Diffusion models for accelerated inference and high-res generation

Created 6 months ago

Updated 3 months ago

ComfyUI-MagCache by Zehong-Ma

Accelerating diffusion model inference with caching

Created 7 months ago

Updated 1 month ago

TaylorSeer by Shenyi-Z

Accelerating diffusion models with predictive feature caching

Created 10 months ago

Updated 5 months ago

Awesome-DiT-Inference by xlite-dev

Awesome diffusion inference papers

Created 2 years ago

Updated 1 month ago

piecewise-rectified-flow by magic-research

PeRFlow: Plug-and-play accelerator for diffusion models (NeurIPS 2024)

Created 1 year ago

Updated 4 months ago

TeaCache by ali-vilab

Training-free caching approach for video diffusion model inference

Created 1 year ago

Updated 7 months ago

stable-diffusion-xl-demo by TonyLianLong

Gradio WebUI demo for Stable Diffusion XL 1.0

Created 2 years ago

Updated 1 year ago

Starred by Yaowei Zheng (Author of LLaMA-Factory) and Luis Capelo (Cofounder of Lightning AI).

onediff by siliconflow

Acceleration library for diffusion models

Created 3 years ago

Updated 1 month ago

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Yaowei Zheng (Author of LLaMA-Factory), and 1 more.

FastVideo by hao-ai-lab

Framework for accelerated video generation

Created 1 year ago

Updated 1 day ago

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Rodrigo Nader (Cofounder of Langflow), and 1 more.

DiffSynth-Studio by modelscope

Open-source project for diffusion model exploration

Created 2 years ago

Updated 3 days ago
