Comfyui_CXH_joy_caption by StartHua

ComfyUI extension for image captioning and tagging workflows

Created 1 year ago

625 stars

Top 52.1% on SourcePulse

Project Summary

This repository provides ComfyUI nodes for image captioning and prompt generation, aimed at users of Stable Diffusion and similar generative models. It integrates Joy_caption, MiniCPMv2_6, and Florence-2, and supports batch processing of images for AI art workflows.

How It Works

The project exposes each model as ComfyUI nodes for image analysis and text generation: Joy_caption for detailed image descriptions, MiniCPMv2_6 for prompt generation, and Florence-2 for general captioning and prompt engineering. Because the nodes are modular, users can combine models in one workflow; the stated aim is faster processing and higher-quality output than a single-model setup.

Quick Start & Requirements

Install dependencies: python -m pip install -r requirements.txt or run install_req.bat.
Ensure the transformers library is up to date.
Models can be automatically downloaded by ComfyUI or manually placed in specified directories (e.g., models\Joy_caption_alpha, clip/siglip-so400m-patch14-384, LLM/Meta-Llama-3.1-8B-bnb-4bit).
Manual download is recommended for some models, with links provided in the README.
A working ComfyUI installation is required as the base environment.
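
Since some models have to be placed by hand, it can help to confirm the directories are populated before loading a workflow. The following is a minimal sketch, not part of the extension: the function name find_missing_model_dirs is hypothetical, and it assumes all three example directories listed above sit under a single ComfyUI models root, which the summary does not confirm.

```python
from pathlib import Path

# Relative locations taken from the examples in this summary.
# Assumption: all of them live under one ComfyUI "models" root.
EXPECTED_MODEL_DIRS = [
    "Joy_caption_alpha",
    "clip/siglip-so400m-patch14-384",
    "LLM/Meta-Llama-3.1-8B-bnb-4bit",
]


def find_missing_model_dirs(models_root, expected=EXPECTED_MODEL_DIRS):
    """Return the expected subdirectories that are absent or empty."""
    root = Path(models_root)
    missing = []
    for rel in expected:
        candidate = root / rel
        # An empty directory usually means the download never completed.
        if not candidate.is_dir() or not any(candidate.iterdir()):
            missing.append(rel)
    return missing
```

Calling find_missing_model_dirs with the path to the local models folder returns a list of the directories that still need a manual download; an empty list means every expected directory exists and contains at least one file.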

Highlighted Details

Supports batch folder tagging and batch image classification.
The README claims a speed ordering of Florence-2 < MiniCPMv2_6 < Joy_caption and cites 4-5 seconds per image on an RTX 4090.
Integrates MiniCPM3-4B for strong chat, translation, and rewriting capabilities.
Includes support for Florence-2-large-PromptGen-v1.5 and Florence-2-base-PromptGen-v1.5.
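
To illustrate what batch folder tagging involves, here is a minimal sketch under stated assumptions. It is not the extension's implementation: tag_folder and its caption_fn parameter are hypothetical names, the captioning model is represented by any callable that maps an image path to a string, and writing one .txt file next to each image is an assumed convention rather than confirmed behavior of these nodes.

```python
from pathlib import Path
from typing import Callable, List

# Assumed set of image extensions; adjust to the dataset at hand.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def tag_folder(folder, caption_fn: Callable[[Path], str],
               overwrite: bool = False) -> List[Path]:
    """Write a .txt caption next to each image; return the files written."""
    written = []
    # sorted() snapshots the listing, so files created below are not revisited.
    for image_path in sorted(Path(folder).iterdir()):
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        # Skip images that were already tagged unless asked to redo them.
        if caption_path.exists() and not overwrite:
            continue
        caption = caption_fn(image_path).strip()
        caption_path.write_text(caption + "\n", encoding="utf-8")
        written.append(caption_path)
    return written
```

Passing the captioner in as a callable keeps the loop independent of which model produces the text, and skipping existing caption files lets an interrupted run resume without recaptioning images that are already done.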

Maintenance & Community

The README carries dated update entries (e.g., 2024-10-30, 2024-10-16), which are the main indicator of project activity.
Model download links point to Hugging Face and Baidu Netdisk.

Licensing & Compatibility

The README does not explicitly state a license.
Compatibility with commercial or closed-source projects is not specified.

Limitations & Caveats

The project relies on external model downloads, some of which must be fetched manually. The README notes version requirements for dependencies such as transformers but does not detail compatibility with different ComfyUI versions.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days