ComfyUI-CacheDiT by Jasonzzt

GenAI DiT model acceleration for ComfyUI

Created 2 months ago
277 stars

Top 93.4% on SourcePulse

Project Summary

ComfyUI-CacheDiT provides a one-click solution for accelerating Diffusion Transformer (DiT) models within the ComfyUI environment. It targets ComfyUI users who run DiT architectures for image and video generation, offering 1.4-2.0x speedups with minimal configuration and no perceptible quality degradation. The primary benefit is reduced inference time for computationally intensive DiT models.

How It Works

The node implements an intelligent caching strategy inspired by llm-scaler. After an initial warmup phase, it selectively reuses previously computed intermediate results based on a skip_interval and the current inference step. When the skip conditions are met, cached data is reused; otherwise, a fresh computation is performed and its result is cached for future steps. This minimizes redundant computation, yielding substantial performance gains.
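The warmup-plus-skip-interval pattern described above can be sketched as follows. This is a minimal illustration, not the node's actual code; the names StepCache, warmup_steps, skip_interval, and compute_block are all hypothetical:

```python
# Minimal sketch of a skip-interval caching strategy for per-step results.
# All names here are illustrative, not the node's actual API.

class StepCache:
    def __init__(self, warmup_steps=3, skip_interval=2):
        self.warmup_steps = warmup_steps    # always compute during warmup
        self.skip_interval = skip_interval  # reuse cache on skipped steps
        self.cached = None

    def run(self, step, compute_block):
        in_warmup = step < self.warmup_steps
        # After warmup, reuse the cached result except on every
        # `skip_interval`-th step, which recomputes and refreshes the cache.
        skip = (not in_warmup
                and self.cached is not None
                and (step - self.warmup_steps) % self.skip_interval != 0)
        if skip:
            return self.cached              # cache hit: reuse previous result
        self.cached = compute_block()       # cache miss: compute and store
        return self.cached

cache = StepCache(warmup_steps=2, skip_interval=2)
outputs = [cache.run(s, lambda s=s: f"computed@{s}") for s in range(6)]
# Steps 0-1 warm up; step 2 recomputes; step 3 reuses step 2's result;
# step 4 recomputes; step 5 reuses it again.
```

With these toy settings, two of six steps skip computation entirely, which is where the speedup comes from once step counts are high enough to amortize the warmup.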

Quick Start & Requirements

Highlighted Details

  • Achieves 1.4-2.0x speedup across various DiT models including Z-Image, Z-Image-Turbo, Qwen-Image-2512, Flux.2 Klein, LTX-2, and WAN2.2 14B.
  • Features "one-click" acceleration with automatic parameter tuning, requiring zero manual configuration for most models.
  • Includes dedicated nodes (LTX2 Cache Optimizer, Wan Cache Optimizer) for specialized architectures like LTX-2 and WAN2.2 14B to ensure optimal performance and temporal consistency.
  • Reports minimal impact on image quality when run with the default settings.

Maintenance & Community

No specific details regarding maintainers, sponsorships, partnerships, or community channels (like Discord/Slack) are provided in the README.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatibility: Designed for ComfyUI. The Apache 2.0 license is permissive, generally allowing for commercial use and integration into closed-source projects.

Limitations & Caveats

The speedup benefit is significantly reduced for inference runs with very low step counts (fewer than 6 steps) because of warmup overhead. Model auto-detection may occasionally fail, requiring manual selection of the model_type preset. A 0% cache hit rate can occur if the model is not detected, the inference run is too short, or the expected log messages are absent. Support for distilled low-step models other than Z-Image-Turbo still requires validation.
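The low-step-count caveat follows from simple arithmetic: warmup steps always pay full cost, so the fewer total steps, the less there is left to cache. The sketch below uses illustrative numbers (3 warmup steps, a skip_interval of 2, cached steps costing ~10% of a full step), not measurements from the node:

```python
# Rough effective-speedup estimate under assumed parameters.
# Warmup steps always compute; after warmup, roughly every other step
# (skip_interval=2) is served from cache at a small fraction of full cost.

def effective_speedup(total_steps, warmup=3, skip_interval=2, cached_cost=0.1):
    after_warmup = max(total_steps - warmup, 0)
    # Steps that recompute after warmup (every skip_interval-th step, rounded up).
    recomputed = (after_warmup + skip_interval - 1) // skip_interval
    cached = after_warmup - recomputed          # steps served from cache
    computed = total_steps - cached             # warmup + recomputed steps
    cost = computed + cached * cached_cost      # cached steps are nearly free
    return total_steps / cost

print(round(effective_speedup(30), 2))  # many steps: warmup is amortized
print(round(effective_speedup(5), 2))   # <6 steps: barely any benefit
```

At 30 steps the estimate lands in the reported 1.4-2.0x range, while at 5 steps almost every step is either warmup or a recompute, leaving little room for caching to help.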

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 30 stars in the last 30 days

Starred by Alex Yu (Research Scientist at OpenAI; cofounder of Luma AI) and Cody Yu (coauthor of vLLM; MTS at OpenAI).
