Tool for cropping images based on text queries
Top 82.2% on sourcepulse
This project enables semantic search within images and YouTube videos using text descriptions, allowing users to find and crop specific subjects. It's designed for users interested in content discovery, dataset creation, and exploring advanced image search capabilities.
How It Works
The system combines YOLOv5 for object detection with OpenAI's CLIP model. YOLOv5 detects and crops objects belonging to the COCO dataset classes it was pre-trained on. Each crop is then encoded with CLIP, as is the text search query, and the crop whose embedding is most similar to the query embedding is returned as the best semantic match.
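The matching step described above can be illustrated with a minimal sketch. It assumes the crops and the query have already been turned into embedding vectors, and it ranks the crops by cosine similarity to the query. The function names (`cosine_similarity`, `rank_crops`) and the three-dimensional toy vectors are hypothetical and chosen only for illustration; they are not taken from the project's code, and real CLIP embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_crops(text_embedding, crop_embeddings):
    """Return (crop_index, score) pairs, best match first."""
    scored = [
        (index, cosine_similarity(text_embedding, embedding))
        for index, embedding in enumerate(crop_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Toy stand-ins for embeddings (assumption: in the real pipeline these
# would come from CLIP's image and text encoders).
crop_embeddings = [
    [0.9, 0.1, 0.0],  # crop 0
    [0.1, 0.8, 0.2],  # crop 1
    [0.0, 0.2, 0.9],  # crop 2
]
text_embedding = [0.2, 0.9, 0.1]  # the encoded search query

ranking = rank_crops(text_embedding, crop_embeddings)
best_index, best_score = ranking[0]
print(f"best crop: {best_index} (similarity {best_score:.3f})")
```

In this toy data, crop 1 points in nearly the same direction as the query vector, so it ranks first. In the actual pipeline the detector would supply the crops and CLIP would supply both sets of embeddings; only the ranking logic is shown here.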
Quick Start & Requirements
pip install -r requirements.txt
Highlighted Details
Maintenance & Community
The project acknowledges contributions from Ramsri Goutham Golla and OpenAI. No specific community channels or roadmap are detailed in the README.
Licensing & Compatibility
The README does not state a license, so permission for commercial use or for linking from closed-source software is unspecified.
Limitations & Caveats
Object detection is limited in both accuracy and scope to the classes in the COCO dataset, on which YOLOv5 is pre-trained. Subjects outside those classes are not detected, so they cannot be cropped or matched against a query.
2 years ago
Inactive