Image-to-prompt tool for text-to-image models
Top 16.9% on sourcepulse
This tool generates descriptive text prompts for text-to-image models based on input images, aiding users in creating similar artwork. It is designed for artists, designers, and AI enthusiasts looking to leverage existing visuals for new creations.
How It Works
The CLIP Interrogator combines OpenAI's CLIP and Salesforce's BLIP models to analyze an input image and produce optimized text prompts. It leverages pre-trained CLIP models, allowing users to select specific versions (e.g., ViT-L-14/openai for Stable Diffusion 1.X, ViT-H-14/laion2b_s32b_b79k for Stable Diffusion 2.0) for tailored prompt generation.
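The model choice described above is made through the package's Config object. A minimal sketch, assuming the clip-interrogator package's documented Config/Interrogator API (weights for the chosen CLIP model and for BLIP are downloaded on first run):

```python
from clip_interrogator import Config, Interrogator

# Pick the CLIP model that matches the text-to-image model you will prompt:
# ViT-L-14/openai pairs with Stable Diffusion 1.X,
# ViT-H-14/laion2b_s32b_b79k with Stable Diffusion 2.0.
config = Config(clip_model_name="ViT-L-14/openai")
ci = Interrogator(config)  # loads BLIP for captioning and CLIP for ranking
```

Matching the CLIP model to the target text-to-image model matters because the generated prompt is ranked by how well that specific CLIP scores it against the input image.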
Quick Start & Requirements
Install PyTorch with CUDA support first:

pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu117

Then install the package:

pip install clip-interrogator==0.5.4

(or clip-interrogator==0.6.0 for BLIP2 support).
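With the packages installed, generating a prompt from an image can be sketched as follows (assuming clip-interrogator's Config/Interrogator API; the filename input.jpg is a placeholder):

```python
from PIL import Image
from clip_interrogator import Config, Interrogator

# Build the interrogator with the CLIP model suited to Stable Diffusion 1.X.
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))

image = Image.open("input.jpg").convert("RGB")  # models expect RGB input
prompt = ci.interrogate(image)  # BLIP caption extended with CLIP-ranked modifiers
print(prompt)
```

The resulting string can be pasted directly into a Stable Diffusion prompt field to generate artwork in a similar style.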
Highlighted Details
Maintenance & Community
The project has received updates including BLIP2 support, though its most recent activity dates back about a year. Community engagement can be found via the project's GitHub repository.
Licensing & Compatibility
The project is released under an unspecified license. Compatibility for commercial use or closed-source linking is not explicitly detailed.
Limitations & Caveats
License details require verification before use in commercial applications. The project is focused on prompt generation for specific text-to-image models and may not cover all image-to-prompt use cases.