ComfyUI integration for Google's Gemini models
This repository provides custom nodes for ComfyUI, enabling users to integrate Google's Gemini large language models. It targets AI artists, researchers, and developers using ComfyUI for generative tasks, offering enhanced prompt generation, image description, and conversational AI capabilities directly within their existing workflows.
How It Works
The nodes use the Gemini API to interact with three models: Gemini-pro (text), Gemini-pro-vision (text + image), and Gemini 1.5 Pro (text + image + file). They support both implicit API key management (the key is read from environment variables rather than entered on the node, for security) and explicit key input. Key features include multimodal input (images, URLs, and, for Gemini 1.5 Pro, large files up to 20GB), system instruction support, and conversational memory for chatbot use.
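The two key-handling modes described above can be illustrated with a minimal sketch. This is not the repository's code: the function name `resolve_api_key`, the environment variable name `GEMINI_API_KEY`, and the rule that an explicit key takes precedence over the environment are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; the source does not say which one the nodes read.
ENV_VAR = "GEMINI_API_KEY"


def resolve_api_key(explicit_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None,
                    env_var: str = ENV_VAR) -> str:
    """Return the explicit key if given, otherwise fall back to the environment."""
    env = os.environ if env is None else env
    # Assumed precedence: explicit input wins; blank input counts as "not provided".
    if explicit_key and explicit_key.strip():
        return explicit_key.strip()
    # Implicit mode: look the key up in the environment mapping.
    key = env.get(env_var, "").strip()
    if not key:
        raise ValueError(f"No API key: pass one explicitly or set {env_var}")
    return key


if __name__ == "__main__":
    # Demo with an injected mapping so no real key or network access is needed.
    print(resolve_api_key(env={ENV_VAR: "key-from-env"}))
    print(resolve_api_key("key-from-node", env={ENV_VAR: "key-from-env"}))
```

In a real node, the resolved key would typically be handed to the google-generativeai client, for example with `genai.configure(api_key=key)`. Keeping the lookup in one small function means a workflow can be shared without the key embedded in it, as long as the implicit mode is used.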
Quick Start & Requirements
cd custom_nodes && git clone https://github.com/ZHO-ZHO-ZHO/ComfyUI-Gemini.git && cd ComfyUI-Gemini && pip install -r requirements.txt
Requires google-generativeai (version > 0.4.1 recommended for Gemini 1.5 Pro).
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
1 year ago
Inactive