EvilBT
ComfyUI node for image captioning
Top 50.7% on SourcePulse
This repository provides a ComfyUI node for advanced image captioning using the JoyCaptionAlpha Two model. It's designed for users involved in AI image generation and training, offering enhanced control over caption generation for batch processing and fine-tuning.
How It Works
The node integrates the JoyCaptionAlpha Two model into the ComfyUI workflow so that users can generate detailed captions for images. Its batch processing features include adding custom prefixes and suffixes to captions, which helps keep datasets organized when preparing them for model training. The node also exposes sampling parameters such as top_p and temperature, so users can adjust how captions are generated.
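The two kinds of control described above can be sketched in plain Python. This is a minimal illustration, not the node's actual code: the function names `add_prefix_suffix` and `sample_top_p` are hypothetical, and the sampling routine shows the general technique of temperature scaling followed by nucleus (top_p) filtering, which is assumed to match what the underlying language model does when these parameters are set.

```python
import math
import random


def add_prefix_suffix(caption, prefix="", suffix=""):
    """Join prefix, caption, and suffix with spaces, skipping empty parts.

    Hypothetical helper: the node's real joining rules may differ.
    """
    parts = [part.strip() for part in (prefix, caption, suffix)]
    return " ".join(part for part in parts if part)


def sample_top_p(logits, temperature, top_p, rng=None):
    """Pick a token index with temperature scaling and nucleus sampling.

    Assumes temperature > 0 and 0 < top_p <= 1.
    """
    rng = rng or random.Random()

    # Temperature divides the logits: values below 1 sharpen the
    # distribution, values above 1 flatten it.
    scaled = [value / temperature for value in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    probs = [weight / total for weight in weights]

    # Keep the smallest set of most likely tokens whose combined
    # probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    mass = 0.0
    for index in ranked:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break

    # Draw from the kept tokens in proportion to their probability.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


if __name__ == "__main__":
    print(add_prefix_suffix("a cat on a sofa", prefix="photo,", suffix="high quality"))
    print(sample_top_p([2.0, 1.0, 0.5, -1.0], temperature=0.7, top_p=0.9))
```

In this sketch a lower top_p restricts sampling to fewer candidate tokens and a lower temperature concentrates probability on the most likely ones, so both tend to make captions more predictable, while higher values allow more varied wording.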
Quick Start & Requirements
Clone the repository into ComfyUI's custom_nodes directory and install dependencies with pip install -r ComfyUI_SLK_joy_caption_two/requirements.txt.
Requires google/siglip-so400m-patch14-384 and the Joy-Caption-alpha-two model, which should be placed in specific subdirectories within ComfyUI's models folder. Llama 3.1 8B models can be automatically downloaded or manually placed.
An example workflow is provided in the examples/workflows.png file.
Highlighted Details
Adjustable top_p and temperature parameters.
Llama 3.1 8B model support (e.g., unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit).
Maintenance & Community
The project is actively maintained with recent updates addressing bugs and adding features. Users can report issues via GitHub issues.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The README describes the node as "not fully tested" and encourages users to report issues. It also mentions testing in an 8GB VRAM environment, which suggests the node can run with about that much VRAM, though no minimum requirement is stated.
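A rough weight-size estimate helps put the 8GB figure in context. The sketch below is back-of-the-envelope arithmetic only: the function `weight_memory_gib` is hypothetical, the parameter count is taken as a nominal 8 billion from the model name, and the result ignores quantization overhead, activations, the key-value cache, and the SigLIP vision encoder, all of which add to real usage.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    nominal_params = 8e9  # assumption: "8B" read as 8 billion parameters
    for bits in (16, 8, 4):
        size = weight_memory_gib(nominal_params, bits)
        print(f"{bits}-bit weights: about {size:.1f} GiB")
```

Under these assumptions, 4-bit weights for an 8B model take a little under 4 GiB, while 16-bit weights take close to 15 GiB, so a 4-bit variant is the kind of model that could plausibly fit on an 8GB card.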