Comfy-WaveSpeed by chengzeyi

Inference optimization solution for ComfyUI

created 8 months ago
1,104 stars

Top 35.2% on sourcepulse

Project Summary

This project provides inference optimization for ComfyUI, targeting users seeking faster image and video generation. It offers universal, flexible, and fast solutions through dynamic caching and enhanced torch.compile integration, aiming to significantly reduce computation costs and generation times.

How It Works

The optimization relies on two main techniques. "First Block Cache" (FBCache) uses the residual output of the first transformer block as a cheap indicator of change between denoising steps: if that residual is sufficiently similar to the one from the previous step, the cached results of the remaining blocks are reused and their computation is skipped, yielding up to a 2x speedup. "Enhanced torch.compile" compiles model components for faster execution and, unlike the original TorchCompileModel node, supports LoRA models.
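As a rough illustration of the caching idea, here is a minimal sketch in plain PyTorch, not the project's actual node code; the class name, the threshold value, and the first_block/remaining_blocks arguments are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class FirstBlockCache:
    """Sketch of first-block caching (illustrative, not the project's real API)."""

    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold
        self.prev_first_residual = None   # first-block residual from the previous step
        self.cached_tail_residual = None  # cached contribution of the remaining blocks

    def forward(self, hidden, first_block, remaining_blocks):
        first_out = first_block(hidden)
        first_residual = first_out - hidden

        if self.prev_first_residual is not None:
            # Relative change of the first-block residual between denoising steps.
            num = (first_residual - self.prev_first_residual).abs().mean()
            den = self.prev_first_residual.abs().mean() + 1e-8
            if num / den < self.threshold:
                # Similar enough: skip the remaining blocks, reuse their cached effect.
                self.prev_first_residual = first_residual
                return first_out + self.cached_tail_residual

        # Otherwise run the rest of the model and refresh the cache.
        out = first_out
        for block in remaining_blocks:
            out = block(out)
        self.cached_tail_residual = out - first_out
        self.prev_first_residual = first_residual
        return out


# Toy usage with stand-in blocks; a real model would pass its transformer blocks.
blocks = nn.ModuleList(nn.Linear(16, 16) for _ in range(4))
cache = FirstBlockCache(threshold=0.1)
x = torch.randn(2, 16)
y = cache.forward(x, blocks[0], blocks[1:])
```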

Quick Start & Requirements

  • Install via git clone into ComfyUI's custom_nodes directory.
  • Requires ComfyUI.
  • The torch.compile node has specific software and hardware requirements; see the Enhanced torch.compile section. FP8 quantization with torch.compile is not supported on pre-Ada GPUs (e.g., RTX 3090), and torch.compile is not officially supported on Windows.
  • Demo workflows are available in the workflows folder.

Highlighted Details

  • First Block Cache (FBCache) offers 1.5x to 3.0x speedup with acceptable accuracy loss.
  • Supports various models including FLUX, LTXV, HunyuanVideo, SD3.5, and SDXL.
  • The Enhanced torch.compile node works with LoRA models (a usage sketch follows this list).
  • FBCache is incompatible with the FreeU Advanced node pack for SDXL.
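The Enhanced torch.compile node builds on PyTorch's standard torch.compile API. A hedged sketch of that underlying call on a stand-in module follows; the toy network and the "max-autotune" mode are illustrative assumptions, not the node's actual settings.

```python
import torch
import torch.nn as nn

# Stand-in module; in ComfyUI the node compiles the loaded diffusion model's
# components rather than a toy network like this.
model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))

# torch.compile returns a wrapper that traces and compiles on first call.
# "max-autotune" trades longer warm-up for faster steady-state kernels.
compiled = torch.compile(model, mode="max-autotune")
out = compiled(torch.randn(1, 64))
```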

Maintenance & Community

  • The project is marked as [WIP] (work in progress).
  • Users are encouraged to join the Discord server for requests and questions.

Licensing & Compatibility

  • License is not explicitly stated in the README.

Limitations & Caveats

  • Multi-GPU inference is listed as a future feature ([WIP]).
  • torch.compile may have issues with model offloading and requires specific configuration for optimal performance and to avoid recompilation (see the sketch after this list).
  • FP8 quantization with torch.compile is not supported on older GPUs.
  • torch.compile is not officially supported on Windows.
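On the recompilation point, this is a hedged sketch of generic PyTorch knobs that often help; the values are illustrative and are not taken from this project's documentation.

```python
import torch

# Generic PyTorch settings that commonly reduce recompilation churn.
torch._dynamo.config.cache_size_limit = 64  # allow more compiled graph variants

def compile_for_varied_resolutions(model):
    # dynamic=True requests shape-polymorphic kernels, so changing latent
    # resolutions does not force a fresh compilation each time.
    return torch.compile(model, dynamic=True)
```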
Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 121 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Jaret Burkett (founder of Ostris), and 1 more:

nunchaku by nunchaku-tech

  • High-performance 4-bit diffusion model inference engine
  • 3k stars (top 2.1%)
  • created 8 months ago, updated 19 hours ago

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems), Philipp Schmid (DevRel at Google DeepMind), and 1 more:

SageAttention by thu-ml

  • Attention kernel for plug-and-play inference acceleration
  • 2k stars (top 2.4%)
  • created 10 months ago, updated 1 week ago