clip-interrogator by pharmapsychotic

Image-to-prompt tool for text-to-image models

created 3 years ago
2,872 stars

Top 16.9% on sourcepulse

Project Summary

This tool generates descriptive text prompts for text-to-image models based on input images, aiding users in creating similar artwork. It is designed for artists, designers, and AI enthusiasts looking to leverage existing visuals for new creations.

How It Works

The CLIP Interrogator combines OpenAI's CLIP and Salesforce's BLIP models to analyze an input image and produce a text prompt optimized to match it. Users select the pre-trained CLIP model that corresponds to their target generator: ViT-L-14/openai for Stable Diffusion 1.X, or ViT-H-14/laion2b_s32b_b79k for Stable Diffusion 2.0.
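The model pairing described here can be sketched in Python roughly as follows. The helper names `pick_clip_model` and `image_to_prompt` are hypothetical, and treating every 2.x release like 2.0 is an assumption; the `Config`, `Interrogator`, and `interrogate` calls follow the project's published usage and should be checked against the installed version.

```python
# Pairings named in the summary; applying them per major version is an assumption.
CLIP_MODEL_FOR_SD = {
    "1": "ViT-L-14/openai",             # Stable Diffusion 1.X
    "2": "ViT-H-14/laion2b_s32b_b79k",  # Stable Diffusion 2.0
}


def pick_clip_model(sd_version: str) -> str:
    """Return the CLIP model name for a version string such as '1.5' or '2.0'."""
    major = sd_version.strip().split(".")[0]
    if major not in CLIP_MODEL_FOR_SD:
        raise ValueError(f"no documented CLIP model for Stable Diffusion {sd_version}")
    return CLIP_MODEL_FOR_SD[major]


def image_to_prompt(image_path: str, sd_version: str = "1.5") -> str:
    """Generate a prompt for the image (hypothetical wrapper, needs a GPU setup)."""
    # Imported lazily so pick_clip_model works without the heavy dependencies.
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    image = Image.open(image_path).convert("RGB")
    ci = Interrogator(Config(clip_model_name=pick_clip_model(sd_version)))
    return ci.interrogate(image)
```

Calling `image_to_prompt("photo.jpg", "2.0")` would load the ViT-H-14 model before interrogating the image; the file name is only a placeholder.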

Quick Start & Requirements

  • Install via pip: pip install clip-interrogator==0.5.4 (or 0.6.0 for BLIP2 support).
  • Requires PyTorch with GPU support (e.g., pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu117).
  • Default settings require ~6.3GB VRAM; low VRAM settings (~2.7GB) are available.
  • Official documentation and examples are available in the repository.
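As a rough illustration of the VRAM figures above, the sketch below picks a mode from the amount of free GPU memory. The thresholds are the approximate values quoted in this summary, and `choose_vram_mode` and `build_config` are hypothetical helpers; `apply_low_vram_defaults()` is the call the project documents for low-VRAM systems, but it is worth confirming against the installed version.

```python
DEFAULT_VRAM_GB = 6.3  # approximate requirement with default settings
LOW_VRAM_GB = 2.7      # approximate requirement with low-VRAM settings


def choose_vram_mode(free_vram_gb: float) -> str:
    """Classify the available VRAM as 'default', 'low_vram', or 'insufficient'."""
    if free_vram_gb >= DEFAULT_VRAM_GB:
        return "default"
    if free_vram_gb >= LOW_VRAM_GB:
        return "low_vram"
    return "insufficient"


def build_config(free_vram_gb: float, clip_model_name: str = "ViT-L-14/openai"):
    """Build a Config, switching to low-VRAM defaults when memory is tight."""
    # Imported lazily so choose_vram_mode can be used on its own.
    from clip_interrogator import Config

    mode = choose_vram_mode(free_vram_gb)
    if mode == "insufficient":
        raise RuntimeError(f"about {LOW_VRAM_GB} GB of VRAM is needed, got {free_vram_gb}")
    config = Config(clip_model_name=clip_model_name)
    if mode == "low_vram":
        config.apply_low_vram_defaults()
    return config
```

Because both figures are approximate, a real deployment would probably leave some headroom rather than switching modes exactly at the threshold.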

Highlighted Details

  • Supports custom prompt ranking against user-defined term lists.
  • Offers a Stable Diffusion Web UI Extension for integrated use.
  • Available on Colab, HuggingFace, and Replicate for easy access.
  • Configurable with options for CLIP model selection, caching, and VRAM optimization.
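Ranking an image against a user-defined term list might look like the sketch below. `clean_terms` and `rank_terms` are hypothetical helpers; the `LabelTable`, `image_to_features`, and `rank` calls follow the project's published example, and the label `"terms"` passed to `LabelTable` is an arbitrary description string chosen for illustration.

```python
from typing import Iterable, List


def clean_terms(raw_terms: Iterable[str]) -> List[str]:
    """Strip whitespace, drop blank entries, and remove duplicates, keeping order."""
    seen = set()
    cleaned = []
    for term in raw_terms:
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)
    return cleaned


def rank_terms(image_path: str, raw_terms: Iterable[str], top_count: int = 3) -> List[str]:
    """Return the top_count terms that best match the image (needs a GPU setup)."""
    # Imported lazily so clean_terms works without the heavy dependencies.
    from PIL import Image
    from clip_interrogator import Config, Interrogator, LabelTable

    ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
    image = Image.open(image_path).convert("RGB")
    table = LabelTable(clean_terms(raw_terms), "terms", ci)
    return table.rank(ci.image_to_features(image), top_count=top_count)
```

Cleaning the list first is a design choice rather than a requirement of the library; it simply avoids embedding the same term twice.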

Maintenance & Community

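The most recent notable update added BLIP2 support. The last commit was about a year ago and responsiveness is rated inactive (see Health Check), so the project does not appear to be under active development. Community engagement takes place through the project's GitHub repository.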

Licensing & Compatibility

The license is not specified in this summary. Suitability for commercial use or closed-source linking is not explicitly detailed.

Limitations & Caveats

Confirm the license terms in the repository before any commercial use. The project focuses on prompt generation for specific text-to-image models and may not cover all image-to-prompt use cases.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

63 stars in the last 90 days
