ComfyUI_omost by huchenlei

ComfyUI nodes for regional prompt-driven image generation

created 1 year ago
448 stars

Top 68.1% on sourcepulse

Project Summary

This repository provides ComfyUI nodes for Omost, a framework for regional prompting in diffusion models. It enables users to interact with Large Language Models (LLMs) to generate structured prompts that define specific regions and content within an image, facilitating precise control over image generation.

How It Works

The core functionality revolves around an LLM Chat interface, where users converse with an LLM to produce a JSON-like structure detailing image regions, associated prompts (prefixes/suffixes), and color information. This structured data then guides the diffusion process. The implementation supports multiple regional prompting methods, including built-in attention masking (Overlay/Average) and integration with Dense Diffusion for more advanced attention score manipulation.
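As a sketch of the two ideas above, the snippet below models one region entry from the LLM's structured output and shows how the two built-in masking modes could combine overlapping region masks. The field names and the exact combination rules are illustrative assumptions, not the verbatim Omost schema or implementation.

```python
import numpy as np

# Hypothetical shape of one region entry in the LLM's JSON-like output.
# Field names are illustrative; the real Omost schema may differ.
region = {
    "rect": (0, 0, 32, 32),          # (y0, x0, y1, x1) in latent coordinates
    "prefixes": ["a cozy cabin"],    # broader-context prompt fragments
    "suffixes": ["warm lighting"],   # region-specific prompt fragments
    "color": (120, 80, 40),          # preview color for the canvas editor
}

def region_mask(rect, h, w):
    """Binary mask covering one region's rectangle."""
    y0, x0, y1, x1 = rect
    m = np.zeros((h, w), dtype=np.float32)
    m[y0:y1, x0:x1] = 1.0
    return m

def combine_masks(masks, mode="average"):
    """Combine per-region masks in the spirit of the two built-in modes:
    'overlay' lets later regions overwrite earlier ones where they overlap,
    'average' normalizes each pixel by its total coverage."""
    h, w = masks[0].shape
    if mode == "overlay":
        out = np.zeros((len(masks), h, w), dtype=np.float32)
        covered = np.zeros((h, w), dtype=np.float32)
        for i in reversed(range(len(masks))):  # last region wins
            out[i] = masks[i] * (1.0 - covered)
            covered = np.maximum(covered, masks[i])
        return out
    total = np.maximum(sum(masks), 1e-6)       # avoid division by zero
    return np.stack([m / total for m in masks])
```

Each per-region mask would then weight that region's prompt conditioning during attention, which is what distinguishes these masking modes from Dense Diffusion's direct manipulation of attention scores.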

Quick Start & Requirements

  • Install via ComfyUI custom nodes.
  • Requires ComfyUI.
  • For accelerated LLM inference, Text Generation Inference (TGI) or llama.cpp with GGUF models is recommended. TGI requires ~20GB VRAM for an 8B LLM.
  • Official documentation: https://github.com/huchenlei/ComfyUI_omost
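For the first bullet, a typical manual custom-node install looks like the following; the paths assume a standard ComfyUI checkout, and the dependency step applies only if the node ships a `requirements.txt`:

```shell
# From the root of an existing ComfyUI installation
cd ComfyUI/custom_nodes
git clone https://github.com/huchenlei/ComfyUI_omost

# Install Python dependencies, if the repository provides them
pip install -r ComfyUI_omost/requirements.txt
```

Restart ComfyUI afterwards so the new nodes are registered. The ComfyUI Manager can perform the same steps from the UI.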

Highlighted Details

  • LLM Chat nodes for interactive prompt generation.
  • Canvas editor for manual region and prompt manipulation.
  • Support for OmostDenseDiffusion backend for advanced regional control.
  • Options to connect to external LLM services (TGI, llama.cpp) for accelerated inference.
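To illustrate the last bullet, here is a minimal sketch of calling a running TGI server over its public HTTP API (`/generate` with an `inputs`/`parameters` JSON body). The URL, prompt, and generation parameters are placeholder assumptions for a locally hosted instance:

```python
import json
import urllib.request

TGI_URL = "http://localhost:8080/generate"  # assumption: local TGI instance

def build_payload(prompt: str, max_new_tokens: int = 512) -> dict:
    """Build the JSON body TGI expects for one generation request."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": 0.6},
    }

def generate(prompt: str) -> str:
    """POST the prompt to TGI and return the generated text.

    Requires a live TGI server; TGI responds with {"generated_text": ...}.
    """
    req = urllib.request.Request(
        TGI_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

A llama.cpp server exposes a similar HTTP endpoint, so the same pattern applies with a different URL and payload shape.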

Maintenance & Community

  • Active development with recent updates in June 2024.
  • The README does not list any community channels (e.g., a Discord server or discussion forum).

Licensing & Compatibility

  • The repository itself appears to be under a permissive license, but it integrates with other projects (Omost, ComfyUI, DenseDiffusion, TGI, llama.cpp) which have their own licenses. Users should verify compatibility, especially for commercial use.

Limitations & Caveats

  • Some advanced regional prompting methods (gradient optimization, external control models) are listed as "To be implemented."
  • The base LLM inference can be slow (3-5 minutes per chat on a 4090) without acceleration.
  • ComfyUI_densediffusion does not compose with IPAdapter.
Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 10 stars in the last 90 days
