cache-dit by vipshop

Accelerate diffusion transformer inference with unified caching

Created 3 months ago · 316 stars · Top 85.4% on SourcePulse

View on GitHub
Project Summary

A Unified Cache Acceleration Toolbox for 🤗Diffusers: FLUX.1, Qwen-Image-Edit, Qwen-Image, Qwen-Image-Lightning, Wan 2.1/2.2, etc.

cache-dit is a Python toolbox for accelerating Diffusion Transformer (DiT) models within 🤗Diffusers. It offers training-free cache acceleration via techniques such as DBCache and TaylorSeer, significantly speeding up inference. Aimed at researchers and engineers, it provides a unified API for easy integration across numerous DiT architectures.

How It Works

The library reduces redundant computation across denoising steps through caching, and its unified API (cache_dit.enable_cache) makes integration a one-call change. Key techniques include DBCache, which balances speed against precision through configurable compute blocks (Fn, Bn), and Hybrid TaylorSeer, which uses Taylor-series expansion to preserve accuracy at larger cache steps. CFG caching and torch.compile compatibility are also supported.
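To make the block-level idea concrete, here is a toy sketch of dual-block caching under my own simplifying assumptions (callable blocks, a single residual cached for the middle span); it illustrates the Fn/Bn trade-off described above, not cache-dit's actual implementation.

```python
import torch

# Toy illustration of the dual-block caching idea (not the library's code):
# always compute the first Fn and last Bn transformer blocks, and on
# cache-hit steps reuse a stored residual for the middle blocks.
def run_blocks(blocks: list, x: torch.Tensor, Fn: int, Bn: int,
               cache: dict, use_cache: bool) -> torch.Tensor:
    for block in blocks[:Fn]:                  # first Fn blocks: always computed
        x = block(x)
    if use_cache and cache.get("mid") is not None:
        x = x + cache["mid"]                   # reuse cached middle-block residual
    else:
        mid_in = x
        for block in blocks[Fn:len(blocks) - Bn]:
            x = block(x)
        cache["mid"] = x - mid_in              # refresh the cached residual
    for block in blocks[len(blocks) - Bn:]:    # last Bn blocks: always computed
        x = block(x)
    return x
```

Larger Fn/Bn values recompute more blocks per step (higher precision, less speedup); smaller values lean harder on the cache.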

Quick Start & Requirements

Install via pip: pip install -U cache-dit. Requires Python, 🤗Diffusers, and PyTorch; a CUDA-capable GPU is recommended for meaningful speedups. Examples and documentation in the repository detail integration for specific models.
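A minimal quick-start sketch: cache_dit.enable_cache is the API named in this summary, while the model id, prompt, and generation settings below are illustrative choices of mine, not a documented recipe.

```python
import torch
import cache_dit
from diffusers import DiffusionPipeline

# Load a supported DiT pipeline; FLUX.1-dev is one of the models the
# project lists, used here as an illustrative choice.
pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

# One-call, training-free cache acceleration via the unified API named
# in this summary; keyword options for DBCache/TaylorSeer tuning are
# documented in the repository and not guessed at here.
cache_dit.enable_cache(pipe)

image = pipe(
    "a cinematic photo of a snow leopard at dusk",
    num_inference_steps=28,
).images[0]
image.save("flux_cached.png")
```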

Highlighted Details

  • Supports numerous DiT models (Qwen-Image, FLUX.1, Wan 2.1/2.2, SD 3/3.5, etc.).
  • Achieves significant speedups (up to 3.3x reported) with configurations combining FP8 quantization and torch.compile (see the sketch after this list).
  • Features an Automatic Block Adapter for custom Transformer blocks.
  • Includes a CLI for evaluating accuracy metrics (PSNR, FID).
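A hedged sketch of pairing the cache with torch.compile, the combination the reported speedups build on. Compiling only the transformer backbone is a common Diffusers pattern assumed here, not a documented cache-dit recipe, and the FP8 quantization part of the 3.3x figure is omitted.

```python
import torch
import cache_dit
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Enable caching first, then compile the DiT backbone. Compiling only
# the transformer (rather than the whole pipeline) is a common Diffusers
# pattern; treat this pairing as illustrative, not prescriptive.
cache_dit.enable_cache(pipe)
pipe.transformer = torch.compile(pipe.transformer, mode="max-autotune")

image = pipe("a watercolor city skyline").images[0]
```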

Maintenance & Community

The project is primarily associated with Vipshop (vipshop.com). Community contributions are encouraged via GitHub stars and CONTRIBUTE.md. No dedicated community channels or roadmap details are provided.

Licensing & Compatibility

The license type is not specified in the provided README. Compatible with 🤗Diffusers and torch.compile.

Limitations & Caveats

Unified cache APIs are experimental. torch.compile with dynamic shapes may require torch._dynamo recompile limit adjustments. Project authorship appears concentrated, potentially indicating a low bus factor.
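For the recompile-limit caveat, the usual workaround is raising Dynamo's recompilation cap before compiling; the config attribute is cache_size_limit in older PyTorch releases (renamed recompile_limit later), so treat the exact name and value below as version-dependent assumptions.

```python
import torch

# Raise Dynamo's recompilation cap so dynamic-shape runs do not silently
# fall back to eager once the default limit is hit. The attribute name
# varies by PyTorch version (cache_size_limit vs. recompile_limit);
# the value 64 is an arbitrary illustrative choice.
torch._dynamo.config.cache_size_limit = 64
```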

Health Check

  • Last Commit: 20 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 99
  • Issues (30d): 17
  • Star History: 148 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Jeremy Howard (Cofounder of fast.ai).

GPTFast by MDK8888

687 stars
HF Transformers accelerator for faster inference
Created 1 year ago · Updated 1 year ago
Starred by Chaoyu Yang (Founder of Bento), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

nunchaku by nunchaku-tech

3k stars
High-performance 4-bit diffusion model inference engine
Created 10 months ago · Updated 2 days ago
Starred by Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI) and Cody Yu (Coauthor of vLLM; MTS at OpenAI).

xDiT by xdit-project

2k stars
Inference engine for parallel Diffusion Transformer (DiT) deployment
Created 1 year ago · Updated 1 day ago