ComfyUI nodes for local/API LLMs & LMMs
This repository provides custom nodes for ComfyUI, enabling users to integrate local Large Language Models (LLMs) and Large Multimodal Models (LMMs) directly into their image generation workflows. It targets users seeking to leverage advanced AI capabilities like OCR, RAG, and object detection for prompt enhancement and content creation.
How It Works
The nodes facilitate interaction with various LLM backends, including Ollama, LlamaCPP, LMstudio, TextGen, and Transformers, as well as cloud APIs from providers such as OpenAI, Google Gemini, and Anthropic. They support multimodal inputs and advanced RAG techniques such as nanoGraphRAG and OCR-RAG, alongside object detection with Florence2. Users can also define custom "assistant" characters with specific system prompts and presets. The sketch below illustrates the kind of backend call these nodes wrap.
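For illustration only, here is a minimal sketch of the sort of request a node might send to a local Ollama backend. It assumes Ollama is running on its default port (11434) with a vision-capable model already pulled; it is not the repository's actual node code.

```python
import base64
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def generate(prompt: str, model: str = "llama3.2-vision", image_path: str | None = None) -> str:
    """Send an (optionally multimodal) prompt to a local Ollama server and return the reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_path:
        # Ollama accepts base64-encoded images for vision-capable models.
        with open(image_path, "rb") as f:
            payload["images"] = [base64.b64encode(f.read()).decode("utf-8")]
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"]

print(generate("Describe this image for a Stable Diffusion prompt.", image_path="input.png"))
```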
Quick Start & Requirements
- Install poppler (used for PDF processing): scoop install poppler (Windows), sudo apt-get install poppler-utils (Debian/Ubuntu), or brew install poppler (macOS).
- Pull a vision-capable model for local multimodal use, e.g. ollama run llama3.2-vision.
- Set API keys as environment variables (e.g. XAI_API_KEY, GOOGLE_API_KEY) or use a .env file for cloud API access; a sketch of loading them follows this list.
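As a hedged illustration of the .env approach, the snippet below loads keys with the python-dotenv package. The .env contents shown in the comment are illustrative, and the error-handling logic is an assumption, not taken from the repository.

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv

# Example .env file placed in your working directory (contents are illustrative):
#   XAI_API_KEY=sk-...
#   GOOGLE_API_KEY=AIza...
load_dotenv()  # reads .env from the current working directory into the process environment

xai_key = os.getenv("XAI_API_KEY")
google_key = os.getenv("GOOGLE_API_KEY")
if not (xai_key or google_key):
    raise SystemExit("No cloud API key found; set environment variables or create a .env file.")
```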
Highlighted Details
The project ships a dedicated prompt-maker model, llama3_if_ai_sdpromptmkr_q4km (a q4_K_M quantization), for Stable Diffusion prompt generation; a usage sketch follows.
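Here is a minimal sketch of using that model for prompt expansion through a local Ollama server. It assumes the model has already been pulled into Ollama (the published tag may include a publisher namespace) and that it responds to plain chat messages; these details are assumptions, not confirmed by this summary.

```python
import requests

# Assumes the prompt-maker model has been pulled into Ollama beforehand;
# adjust the tag if the published model uses a publisher namespace.
MODEL = "llama3_if_ai_sdpromptmkr_q4km"

def expand_prompt(idea: str) -> str:
    """Ask the prompt-maker model to turn a short idea into a detailed SD prompt."""
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": idea}],
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

print(expand_prompt("a cozy cabin in a snowy forest at dusk"))
```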
Maintenance & Community
The project is actively developed, with a stated intention to move prompt generation into a separate repository (ComfyUI-IF_AI_PromptImaGen). The project can be supported by starring it on GitHub or through the maintainer's YouTube channel (Impact Frames), X account (Impact Frames X), Ko-fi, and Patreon.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The README includes a TODO list indicating ongoing development: bug fixes for the latest ComfyUI versions and improvements to the Graph Visualizer and IF_Assistants nodes. Frontend development for assistants and chat is also planned.