CLIPPyX  by 0ssamaak0

AI-powered search tool for content-based image and text similarity

created 1 year ago
257 stars

Top 98.8% on sourcepulse

Project Summary

CLIPPyX offers system-wide search capabilities for text and images, leveraging AI for content-based and visual similarity. It targets users needing to efficiently locate files based on their visual content, embedded text, or descriptive captions, enhancing productivity for digital asset management.

How It Works

CLIPPyX utilizes OpenAI's CLIP model to generate image embeddings, storing them in a vector database for efficient similarity searches. It also employs OCR to extract text from images, embedding this text with a separate model for semantic text search. A Flask server handles search queries from various UIs, querying both image and text embeddings to return relevant results.
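
The retrieval step described above can be illustrated with a toy example. This is a minimal sketch, not CLIPPyX's code: the vectors are made-up 3-dimensional stand-ins for real model embeddings, and `ToyVectorIndex` and `merged_search` are hypothetical names that play the role of the vector database and the server's query logic. Combining image and text scores by taking the maximum is also an assumption chosen for simplicity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorIndex:
    """In-memory stand-in for a vector database (hypothetical)."""

    def __init__(self):
        self._items = []  # list of (path, embedding) pairs

    def add(self, path, embedding):
        self._items.append((path, list(embedding)))

    def search(self, query_embedding, top_k=3):
        # Score every stored embedding, then return the best matches first.
        scored = [
            (cosine_similarity(query_embedding, emb), path)
            for path, emb in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(path, score) for score, path in scored[:top_k]]


def merged_search(image_index, text_index, image_query, text_query, top_k=3):
    """Query both indexes and keep each file's best score (assumed merge rule)."""
    best = {}
    for index, query in ((image_index, image_query), (text_index, text_query)):
        for path, score in index.search(query, top_k):
            best[path] = max(score, best.get(path, float("-inf")))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Made-up vectors standing in for image embeddings.
image_index = ToyVectorIndex()
image_index.add("cat.jpg", [0.9, 0.1, 0.0])
image_index.add("invoice.png", [0.0, 0.2, 0.9])

# Made-up vector standing in for the embedding of OCR-extracted text.
text_index = ToyVectorIndex()
text_index.add("invoice.png", [0.1, 0.9, 0.1])

results = merged_search(image_index, text_index, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(results[0][0])
```

In a real deployment the query vectors would come from the same models that produced the stored embeddings, since scores are only meaningful within one embedding space.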

Quick Start & Requirements

  • Install via pip install -e . after cloning the repository.
  • Requires PyTorch.
  • Supports various CLIP and text embedding models from Hugging Face, Apple's MobileClip, Ollama, llama.cpp, and OpenAI-compatible APIs.
  • UI integration is supported via HTTP requests to the server.
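
As a rough illustration of what a UI integration over HTTP might look like, the sketch below builds a search URL with the standard library. The host, port, `/search` path, and the `query` and `top_k` parameter names are all assumptions made for this example; the actual routes and parameters are defined by the CLIPPyX server and should be taken from the repository.

```python
from urllib.parse import urlencode, urlunsplit


def build_search_url(query, host="127.0.0.1", port=5000, path="/search", top_k=5):
    """Build a GET URL for a text search (endpoint and parameters are hypothetical)."""
    params = urlencode({"query": query, "top_k": top_k})
    return urlunsplit(("http", f"{host}:{port}", path, params, ""))


url = build_search_url("red bicycle near a wall")
print(url)

# To send the request against a running server:
#   from urllib.request import urlopen
#   with urlopen(url, timeout=10) as response:
#       body = response.read()
```

A launcher plugin would issue a request like this on each query and render the returned file paths.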

Highlighted Details

  • Search by image caption, textual content within images (via OCR), and visual similarity.
  • Integrates with popular launchers like RayCast (macOS), Flow Launcher (Windows), and PowerToys Run (Windows).
  • Offers a "Deep Scan" option that re-indexes files whose content has changed even when the filename is unchanged.
  • Supports multiple embedding providers for flexibility.
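
One common way to notice that a file's content changed while its name stayed the same is to compare content hashes. The sketch below shows that general technique; whether "Deep Scan" works this way is not stated in the summary, and `file_digest` and `files_needing_reindex` are hypothetical helpers written for this example.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=65536):
    """SHA-256 of a file's bytes, read in chunks to bound memory use."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def files_needing_reindex(paths, known_digests):
    """Return paths that are new or whose content hash changed; update the record."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if known_digests.get(path) != digest:
            changed.append(path)
            known_digests[path] = digest
    return changed


with tempfile.TemporaryDirectory() as folder:
    image_path = os.path.join(folder, "photo.jpg")
    with open(image_path, "wb") as handle:
        handle.write(b"original bytes")

    seen = {}
    first_pass = files_needing_reindex([image_path], seen)   # new file
    second_pass = files_needing_reindex([image_path], seen)  # unchanged

    with open(image_path, "wb") as handle:
        handle.write(b"edited bytes")  # same filename, new content
    third_pass = files_needing_reindex([image_path], seen)

print(len(first_pass), len(second_pass), len(third_pass))
```

Hashing reads every byte of every file, so a check of this kind costs time proportional to the total size of the scanned directory.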

Maintenance & Community

  • Project sponsored by Cohere, with an alternative installation branch using Cohere Multimodal Embed 3.
  • Open to feature requests and bug reports via GitHub Issues.

Licensing & Compatibility

  • The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

The search server is described as a development server, and production deployment without a dedicated WSGI server is advised against. The "Deep Scan" feature may impact performance on large directories. License information is not provided, which may affect commercial adoption.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

16 stars in the last 90 days

Explore Similar Projects

Starred by John Resig (Author of jQuery; Chief Software Architect at Khan Academy), Chenlin Meng (Cofounder of Pika), and 4 more.

clip-retrieval by rom1504

0.3%
3k
CLIP retrieval system for semantic search
created 4 years ago
updated 1 year ago