comfyui_dagthomas  by dagthomas

ComfyUI extension for advanced prompt/image processing

created 1 year ago
258 stars

Top 98.6% on sourcepulse

Project Summary

This ComfyUI extension adds prompt generation and image analysis nodes for users who want more control over their AI image creation workflows. It offers nodes for GPT-4 powered text generation, image description via GPT-4 Vision, local LLM integration through Ollama, and structured prompt building with dynamic, category-based generation.

How It Works

The extension introduces several custom nodes. PromptGenerator and APNextNode allow for structured and randomized prompt creation, pulling elements from user-defined JSON files organized into categories. GPT4VisionNode leverages GPT-4 Vision to analyze images and generate detailed descriptions, with options for output detail and length. GPT4MiniNode and OllamaNode provide text generation capabilities using OpenAI's GPT-4 and local Ollama models, respectively, supporting custom base prompts and output formatting. A PGSD3LatentGenerator is also included for Stable Diffusion 3 latent creation.
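To make the node idea concrete, here is a minimal sketch of a ComfyUI custom node that builds a randomized prompt from categories. It follows the general ComfyUI node convention (INPUT_TYPES, RETURN_TYPES, FUNCTION, CATEGORY, NODE_CLASS_MAPPINGS), but the class name RandomCategoryPrompt, its inputs, and the hard-coded category data are all hypothetical and are not taken from this extension's source.

```python
import random


class RandomCategoryPrompt:
    """Hypothetical node: picks one entry per category and joins them."""

    # Illustrative data only; the real extension reads categories from JSON files.
    CATEGORIES = {
        "subject": ["a lighthouse", "a red fox", "an astronaut"],
        "style": ["watercolor", "film photograph", "pixel art"],
        "lighting": ["golden hour", "overcast", "neon glow"],
    }

    @classmethod
    def INPUT_TYPES(cls):
        # ComfyUI reads this mapping to build the node's input widgets.
        return {
            "required": {
                "seed": ("INT", {"default": 0, "min": 0, "max": 0xFFFFFFFF}),
                "prefix": ("STRING", {"default": "", "multiline": True}),
            }
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "generate"          # name of the method ComfyUI calls
    CATEGORY = "examples/prompt"   # menu location, chosen for this sketch

    def generate(self, seed, prefix):
        rng = random.Random(seed)  # the same seed reproduces the same prompt
        picks = [rng.choice(options) for options in self.CATEGORIES.values()]
        head = [prefix.strip()] if prefix.strip() else []
        return (", ".join(head + picks),)  # node outputs are returned as a tuple


# ComfyUI discovers nodes through this module-level mapping.
NODE_CLASS_MAPPINGS = {"RandomCategoryPrompt": RandomCategoryPrompt}
```

Seeding a private random.Random instance, rather than the global generator, keeps the output repeatable for a given seed without affecting other nodes in the graph.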

Quick Start & Requirements

  • Installation: Add the repository to your ComfyUI custom nodes directory.
  • OpenAI API Key: Required for GPT4VisionNode and GPT4MiniNode. Set as an environment variable: OPENAI_API_KEY=sk-your-api-key-here.
  • Ollama: Required for OllamaNode.
  • Dependencies: Install the additional Python packages that the extension's import statements require.
  • Custom Categories: Create JSON files in comfyui_dagthomas/data/next/[CATEGORY_NAME]/ for APNextNode customization.
  • Example Workflow: Download apntest.json.
  • Documentation: Detailed documentation is in progress.
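The custom-category layout described above can be read with a few lines of Python. This is a sketch only: the summary does not document the file schema, so the code assumes each JSON file is an object with an "items" list, and the function name load_categories is hypothetical.

```python
import json
from pathlib import Path


def load_categories(root):
    """Map each category folder under root to {field_name: [items, ...]}."""
    categories = {}
    for category_dir in sorted(Path(root).iterdir()):
        if not category_dir.is_dir():
            continue  # skip stray files next to the category folders
        fields = {}
        for json_file in sorted(category_dir.glob("*.json")):
            with json_file.open(encoding="utf-8") as handle:
                data = json.load(handle)
            # Assumed schema: {"items": ["...", "..."]}
            fields[json_file.stem] = list(data.get("items", []))
        categories[category_dir.name] = fields
    return categories
```

Calling load_categories("comfyui_dagthomas/data/next") would then return one dictionary entry per category folder, keyed by the JSON file names inside it.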

Highlighted Details

  • Dynamic Prompt Generation: APNextNode allows users to define custom categories and fields via JSON files, enabling highly flexible and repeatable prompt construction.
  • GPT-4 Vision Integration: Enables image-to-text analysis for detailed image descriptions, useful for prompt seeding or content moderation.
  • Local LLM Support: OllamaNode allows integration with local language models, offering an alternative to cloud-based APIs.
  • SD3 Latent Generation: Includes a node specifically for generating latents compatible with Stable Diffusion 3 pipelines.
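For the local LLM route, a request to Ollama can be sketched with the standard library alone. The example targets Ollama's /api/generate endpoint on its default port 11434; the model name "llama3" and the helper names are assumptions, and this is not necessarily how OllamaNode itself talks to the server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def build_payload(prompt, model="llama3", system=None):
    """Build a non-streaming generate request; the model name is an assumption."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if system:
        payload["system"] = system  # optional base/system prompt
    return payload


def generate(prompt, model="llama3", system=None, url=OLLAMA_URL, timeout=120):
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_payload(prompt, model, system)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With an Ollama server running and the model pulled, generate("Describe a foggy harbor at dawn", system="Write one concise image prompt.") returns the model's text. Setting "stream" to False asks the server for a single JSON reply instead of a stream of chunks.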

Maintenance & Community

  • The project is marked as "beta" with documentation in progress.
  • No specific community links (Discord, Slack) or notable contributors are mentioned in the README.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

  • The project is in beta, and detailed documentation is still being developed.
  • Functionality relies heavily on correctly structured JSON files for custom categories.
  • Usage of OpenAI nodes requires an API key and incurs costs.
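Two of these caveats can be caught before a workflow runs. The following sketch checks that OPENAI_API_KEY is set and validates one category JSON document; both helper names are hypothetical, and the validation assumes a schema of an object with a non-empty "items" list of strings, which this summary does not confirm.

```python
import json
import os


def require_openai_key(env=None):
    """Return the API key, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; the OpenAI nodes need it.")
    return key


def validate_category_file(text):
    """Return a list of problems found in one category JSON document."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    items = data.get("items")  # assumed schema: {"items": [...]}
    if not isinstance(items, list) or not items:
        return ['"items" must be a non-empty list']
    return [
        f"item {index} is not a string"
        for index, item in enumerate(items)
        if not isinstance(item, str)
    ]
```

An empty list from validate_category_file means the document passed every check in this sketch.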

Health Check

  • Last commit: 3 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

  • 16 stars in the last 90 days
