ComfyUI-CogVideoXWrapper by kijai

ComfyUI nodes for CogVideoX models

Created 1 year ago
1,523 stars

Top 27.2% on SourcePulse

Project Summary

This repository provides ComfyUI nodes for integrating and utilizing CogVideoX models, enabling users to generate and manipulate videos using advanced AI techniques. It targets users familiar with ComfyUI and AI video generation, offering enhanced control and support for various CogVideoX variants and experimental features.

How It Works

The wrapper integrates different CogVideoX models (CogVideoX 1.5, CogVideoX-Fun, Tora, and official I2V versions) into the ComfyUI node-based workflow. It supports various generation pipelines including text-to-video, image-to-video, and pose-to-video, with features like LoRA integration, context windowing, GGUF model support, and experimental optimizations like sageattention and onediff. The architecture allows for flexible configuration of model parameters, resolutions, and sampling methods.
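To make the idea of context windowing concrete, here is a minimal sketch of how a long frame sequence can be split into overlapping windows that are processed one at a time. This is a generic illustration only: the summary does not describe how the wrapper actually schedules its windows, and the function name `context_windows` and its parameters `window` and `overlap` are hypothetical, not taken from the repository.

```python
def context_windows(num_frames: int, window: int, overlap: int) -> list:
    """Split frame indices 0..num_frames-1 into overlapping windows.

    Hypothetical helper for illustration; not the wrapper's own scheduler.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    if num_frames <= window:
        # Everything fits into a single window.
        return [list(range(num_frames))]

    stride = window - overlap
    starts = list(range(0, num_frames - window + 1, stride))
    # Add a final window aligned to the end so trailing frames are covered.
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [list(range(start, start + window)) for start in starts]


if __name__ == "__main__":
    for frames in context_windows(num_frames=9, window=4, overlap=2):
        print(frames)
```

Neighbouring windows share `overlap` frames, which is what lets a sampler blend results across window boundaries instead of producing visible seams. How the blending is done is outside the scope of this sketch.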

Quick Start & Requirements

  • Install as a ComfyUI custom node.
  • Requires ComfyUI, Python 3.10+, diffusers 0.30.1+, and torch (2.4.0+ if using onediff).
  • Linux and specific hardware (e.g., NVIDIA GPU with CUDA 12+) may be required for certain optimizations like sageattention and onediff.
  • Refer to the WIP spreadsheet for model support and features.
  • Example workflows are available in the example_workflows folder.
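The version requirements listed above can be checked before installing with a short script. The following is a sketch under stated assumptions: the minimum versions are copied from the list above, the helper names (`parse_version`, `meets_minimum`, `check_environment`) are made up for this example, and the torch minimum is applied unconditionally even though the list states it only for onediff.

```python
import re
import sys
from importlib import metadata

# Minimums copied from the requirements list; torch 2.4.0+ is only
# stated for onediff, so treat a torch warning as advisory otherwise.
MIN_PYTHON = (3, 10)
MIN_PACKAGES = {"diffusers": "0.30.1", "torch": "2.4.0"}


def parse_version(text: str) -> tuple:
    """Turn '2.4.0+cu121' into (2, 4, 0), ignoring local and dev suffixes."""
    numbers = []
    for part in text.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 3:  # pad so '2.4' compares equal to '2.4.0'
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str) -> bool:
    return parse_version(installed) >= parse_version(minimum)


def check_environment() -> list:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
    for name, minimum in MIN_PACKAGES.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```

Run it with the same Python interpreter that ComfyUI uses, since a portable or virtual-environment install may have different packages than the system interpreter.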

Highlighted Details

  • Supports CogVideoX 1.5, CogVideoX-Fun 1.1, Tora, and official I2V models.
  • Integrates LoRA weights and offers features like context windowing and temporal tiling.
  • Experimental optimizations include sageattention (Linux only, ~20-30% speedup) and onediff (~40% sampling time reduction).
  • Includes nodes for advanced workflows like "cut and drag" video input generation.

Maintenance & Community

  • Actively developed with frequent updates and refactoring (e.g., Update 8 introduced breaking changes).
  • Links to related projects like CogVideoX, CogVideoX-Fun, and cogvideox-controlnet are provided.

Licensing & Compatibility

  • The repository itself appears to be under a permissive license, but users must adhere to the licenses of the underlying CogVideoX models they download and use.

Limitations & Caveats

  • Frequent breaking changes require users to update their workflows regularly.
  • Some experimental features, like sageattention and onediff, have specific OS or dependency requirements.
  • The project is marked as "WORK IN PROGRESS," indicating ongoing development and potential instability.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days