Crop-CLIP by vijishmadhavan

Tool for cropping images based on text queries

Created 3 years ago · 340 stars · Top 82.2% on sourcepulse

Project Summary

This project enables semantic search within images and YouTube videos using text descriptions, allowing users to find and crop specific subjects. It's designed for users interested in content discovery, dataset creation, and exploring advanced image search capabilities.

How It Works

The system combines YOLOv5 for object detection with OpenAI's CLIP model. YOLOv5 detects objects from the 80 COCO classes it was pre-trained on and crops them from the image. Each crop is then encoded with CLIP, alongside the text query, and the crop whose embedding is the closest semantic match to the query is returned.
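In code, the detect-then-rank pipeline looks roughly like the sketch below. It assumes the standard torch.hub interface for YOLOv5 and OpenAI's clip package; the file names and variable names are illustrative and are not taken from this repository.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Detect candidate objects with COCO-pretrained YOLOv5 and crop them.
yolo = torch.hub.load("ultralytics/yolov5", "yolov5s")
detections = yolo("photo.jpg")
crops = detections.crop(save=False)  # list of dicts; 'im' holds a BGR ndarray

# 2. Encode every crop and the text query with CLIP.
clip_model, preprocess = clip.load("ViT-B/32", device=device)
query = "Man in suit"

images = torch.stack([
    preprocess(Image.fromarray(c["im"][..., ::-1].copy()))  # BGR -> RGB
    for c in crops
]).to(device)
text = clip.tokenize([query]).to(device)

with torch.no_grad():
    image_features = clip_model.encode_image(images)
    text_features = clip_model.encode_text(text)

# 3. Rank crops by cosine similarity and keep the best semantic match.
image_features = image_features / image_features.norm(dim=-1, keepdim=True)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)
scores = (image_features @ text_features.T).squeeze(1)
best = crops[scores.argmax().item()]
Image.fromarray(best["im"][..., ::-1].copy()).save("best_match.jpg")
```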

Quick Start & Requirements

  • Install via pip install -r requirements.txt.
  • Requires Python 3.x.
  • A Hugging Face Spaces web app is available for a no-setup experience.

Highlighted Details

  • Supports searching within YouTube videos by analyzing frames.
  • Can be adapted for batch processing to create datasets of specific objects (see the sketch after this list).
  • Demonstrates use cases like finding "Man in suit" or "Whiskey Bottle" in visual media.
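For the batch-processing use case, a minimal sketch might loop over a folder of images and save every crop that clears a CLIP similarity threshold. The images/ and dataset/ folder names and the 0.25 threshold below are assumptions for illustration, not values from the repository.

```python
from pathlib import Path

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
yolo = torch.hub.load("ultralytics/yolov5", "yolov5s")
clip_model, preprocess = clip.load("ViT-B/32", device=device)

query = "Whiskey Bottle"
out_dir = Path("dataset") / query.lower().replace(" ", "_")
out_dir.mkdir(parents=True, exist_ok=True)

with torch.no_grad():
    text_features = clip_model.encode_text(clip.tokenize([query]).to(device))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

for img_path in sorted(Path("images").glob("*.jpg")):
    crops = yolo(str(img_path)).crop(save=False)
    if not crops:
        continue
    images = torch.stack([
        preprocess(Image.fromarray(c["im"][..., ::-1].copy()))  # BGR -> RGB
        for c in crops
    ]).to(device)
    with torch.no_grad():
        image_features = clip_model.encode_image(images)
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    scores = (image_features @ text_features.T).squeeze(1)
    # Keep every crop that clears an (assumed) similarity threshold.
    for i, (crop, score) in enumerate(zip(crops, scores.tolist())):
        if score > 0.25:
            Image.fromarray(crop["im"][..., ::-1].copy()).save(
                out_dir / f"{img_path.stem}_{i}.jpg"
            )
```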

Maintenance & Community

The project acknowledges contributions from Ramsri Goutham Golla and OpenAI. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The accuracy and scope of object detection are limited to the 80 classes of the COCO dataset on which YOLOv5 is pre-trained; subjects outside those classes cannot be detected, and therefore cannot be searched for or cropped.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 90 days
