Inference optimization solution for ComfyUI
This project provides inference optimization for ComfyUI, targeting users who want faster image and video generation. It offers universal, flexible, and fast optimizations through dynamic caching and enhanced `torch.compile` integration, aiming to significantly reduce computation cost and generation time.
How It Works
The core of the optimization lies in two main techniques. "First Block Cache" (FBCache) uses the residual output of the first transformer block as a cheap indicator of how much a diffusion step differs from the previous one: when that residual is sufficiently similar to the previous step's, the cached output of the remaining blocks is reused and their computation is skipped, yielding up to a 2x speedup. "Enhanced `torch.compile`" compiles model components for faster execution and, unlike the original `TorchCompileModel` node, also works with LoRA models.
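The snippet below is a minimal sketch of the FBCache idea, not the project's actual implementation: it assumes `blocks` is a list of callables that map a hidden-state tensor to a tensor of the same shape, and the `threshold` value and all names are illustrative.

```python
import torch

class FirstBlockCache:
    """Sketch of the First Block Cache idea (hypothetical names/API)."""

    def __init__(self, blocks, threshold=0.1):
        self.blocks = blocks            # residual transformer blocks
        self.threshold = threshold      # relative-difference threshold
        self.prev_first_residual = None
        self.cached_tail_residual = None

    @torch.no_grad()
    def __call__(self, hidden_states):
        # The first block always runs; its residual output serves as a
        # cheap signal for how much this step differs from the last one.
        first_out = self.blocks[0](hidden_states)
        first_residual = first_out - hidden_states

        if self.prev_first_residual is not None and self.cached_tail_residual is not None:
            rel_diff = (first_residual - self.prev_first_residual).abs().mean() / (
                self.prev_first_residual.abs().mean() + 1e-8
            )
            if rel_diff < self.threshold:
                # Steps are similar enough: skip the remaining blocks and
                # reuse their cached contribution from the previous step.
                self.prev_first_residual = first_residual
                return first_out + self.cached_tail_residual

        # Otherwise run the remaining blocks and cache their contribution.
        out = first_out
        for block in self.blocks[1:]:
            out = block(out)
        self.cached_tail_residual = out - first_out
        self.prev_first_residual = first_residual
        return out
```

Wrapping a model's transformer blocks this way trades a small quality risk for skipped computation on steps where the latent changes little, which is where the up-to-2x speedup comes from.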
Quick Start & Requirements
- Clone the repository (`git clone`) into ComfyUI's `custom_nodes` directory.
- The `torch.compile` node has specific software and hardware requirements; refer to the Enhanced `torch.compile` section.
- FP8 quantization with `torch.compile` is not supported on pre-Ada GPUs (e.g., RTX 3090).
- `torch.compile` is not officially supported on Windows.
- Example workflows are provided in the `workflows` folder.

Highlighted Details
- Unlike the original `TorchCompileModel` node, the enhanced `torch.compile` node works with LoRA models.

Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
- `torch.compile` may have issues with model offloading and requires specific configuration for optimal performance and to avoid recompilation; see the sketch after this list.
- FP8 quantization with `torch.compile` is not supported on older, pre-Ada GPUs.
- `torch.compile` is not officially supported on Windows.
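As a hedged illustration of the kind of configuration the recompilation caveat refers to, the snippet below uses only standard PyTorch options (`mode`, `dynamic`, `fullgraph`, and `torch._dynamo.config.cache_size_limit`); the project's node exposes its own settings, so treat this as a generic sketch rather than the node's actual interface.

```python
import torch

# Raise the recompile cache limit so that varying resolutions or batch
# sizes do not silently fall back to eager mode (value is illustrative).
torch._dynamo.config.cache_size_limit = 64

def compile_diffusion_model(model: torch.nn.Module) -> torch.nn.Module:
    return torch.compile(
        model,
        mode="max-autotune",  # slower first run, faster steady state
        dynamic=True,         # shape-generic kernels reduce recompiles
                              # when image sizes change between runs
        fullgraph=False,      # tolerate graph breaks from custom ops
    )
```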