chengzeyi: Inference optimization solution for ComfyUI
This project provides inference optimization for ComfyUI, targeting users seeking faster image and video generation. It offers universal, flexible, and fast solutions through dynamic caching and enhanced torch.compile integration, aiming to significantly reduce computation costs and generation times.
How It Works
The optimization rests on two main techniques. "First Block Cache" (FBCache) uses the residual output of the first transformer block as a cheap probe: if it is sufficiently similar to the residual from the previous step, the cached result is reused and the remaining blocks are skipped, giving up to a 2x speedup. "Enhanced torch.compile" compiles model components for faster execution and, unlike the built-in TorchCompileModel node, supports LoRA models.
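As a rough illustration of the FBCache idea, the sketch below runs the first transformer block, compares its residual to the previous step's, and skips the remaining blocks on a near-match. The class, the relative-L1 similarity test, and the threshold value are illustrative assumptions, not the project's actual implementation.

```python
import torch

class FirstBlockCache:
    """Minimal sketch of the First Block Cache (FBCache) idea, assuming the
    first block's residual predicts whether the whole step can be reused."""

    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold           # illustrative similarity cutoff
        self.prev_first_residual = None      # first-block residual, last step
        self.cached_final_residual = None    # residual of remaining blocks, last step

    def __call__(self, hidden, blocks):
        # Always compute the first block; its residual acts as a cheap probe.
        first_out = blocks[0](hidden)
        first_residual = first_out - hidden

        if self.prev_first_residual is not None:
            # Relative L1 change between this step's and last step's residuals.
            diff = (first_residual - self.prev_first_residual).abs().mean()
            scale = self.prev_first_residual.abs().mean()
            if diff / (scale + 1e-8) < self.threshold:
                # Close enough: reuse the cached output of the remaining blocks.
                self.prev_first_residual = first_residual
                return first_out + self.cached_final_residual

        # Otherwise run the remaining blocks and refresh the cache.
        out = first_out
        for block in blocks[1:]:
            out = block(out)
        self.prev_first_residual = first_residual
        self.cached_final_residual = out - first_out
        return out

# Usage: second call with the same input is a cache hit and skips blocks[1:].
blocks = [torch.nn.Linear(8, 8) for _ in range(3)]
cache = FirstBlockCache(threshold=0.1)
h = torch.randn(2, 8)
out_cold = cache(h, blocks)  # runs all blocks, fills the cache
out_warm = cache(h, blocks)  # first-block residual matches, remaining blocks skipped
```

In practice the threshold trades quality for speed: a looser cutoff skips more steps but can blur fine detail in the generated output.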
Quick Start & Requirements
- git clone the repository into ComfyUI's custom_nodes directory.
- The torch.compile node has specific software/hardware requirements; refer to the Enhanced torch.compile section.
- FP8 quantization with torch.compile is not supported on pre-Ada GPUs (e.g., RTX 3090).
- torch.compile is not officially supported on Windows.
- Example workflows are provided in the workflows folder.
Highlighted Details
- The enhanced torch.compile node works with LoRA models.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
- torch.compile may have issues with model offloading and requires specific configuration for optimal performance and to avoid recompilation.
- FP8 quantization with torch.compile is not supported on older (pre-Ada) GPUs.
- torch.compile is not officially supported on Windows.
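To make the recompilation caveat concrete, the following is a minimal sketch of how torch.compile is commonly applied, passing dynamic=True so the compiler generalizes over varying input shapes rather than recompiling per shape. The TinyDenoiser model is a hypothetical stand-in, not part of this project.

```python
import torch

# Hypothetical stand-in for a diffusion transformer; in ComfyUI the real
# model would come from the loaded model object, not from this class.
class TinyDenoiser(torch.nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, dim), torch.nn.GELU(), torch.nn.Linear(dim, dim)
        )

    def forward(self, x):
        return self.net(x)

model = TinyDenoiser()

# mode="max-autotune" trades longer warmup for faster steady-state kernels;
# dynamic=True asks the compiler to generalize over varying input shapes
# (e.g., different resolutions), which helps avoid repeated recompilation.
compiled = torch.compile(model, mode="max-autotune", dynamic=True)

x = torch.randn(1, 64)
with torch.no_grad():
    y = compiled(x)  # first call triggers compilation; later calls reuse it
print(y.shape)
```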