Crop-CLIP by vijishmadhavan

Tool for cropping images based on text queries

Created 4 years ago

341 stars

Top 81.0% on SourcePulse

Project Summary

This project enables semantic search within images and YouTube videos using text descriptions, allowing users to find and crop specific subjects. It's designed for users interested in content discovery, dataset creation, and exploring advanced image search capabilities.

How It Works

The system combines YOLOv5 for object detection with OpenAI's CLIP model. YOLOv5 identifies and crops objects based on its pre-trained COCO dataset classes. These cropped images are then encoded using CLIP, alongside the text search query, to find the best semantic match.
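The matching step described above can be sketched as a cosine-similarity ranking. This is a minimal illustration and not the project's actual code: the short vectors stand in for the embeddings CLIP would produce for each YOLOv5 crop and for the text query, and the function names `cosine_similarity` and `best_crop` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_crop(crop_embeddings, text_embedding):
    """Return (index, score) of the crop most similar to the text query."""
    if not crop_embeddings:
        raise ValueError("no crops to rank")
    scores = [cosine_similarity(e, text_embedding) for e in crop_embeddings]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


# Toy stand-ins (assumption): real CLIP embeddings have hundreds of dimensions.
crops = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
query = [0.5, 0.9, 0.0]

index, score = best_crop(crops, query)
print(index, round(score, 3))
```

In a real pipeline the crop vectors would come from an image encoder and the query vector from a text encoder; only the ranking logic is shown here.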

Quick Start & Requirements

Install via pip install -r requirements.txt.
Requires Python 3.x.
A Hugging Face Spaces web app is available for a no-setup experience.

Highlighted Details

Supports searching within YouTube videos by analyzing frames.
Can be adapted for batch processing to create datasets of specific objects.
Demonstrates use cases like finding "Man in suit" or "Whiskey Bottle" in visual media.
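One way the batch or video-frame use case could look is sketched below, assuming each crop has already been scored against the query. The tuple layout, the file names, the function name `select_crops`, and the 0.25 threshold are all hypothetical choices for illustration, not values taken from the project.

```python
def select_crops(scored_crops, min_score=0.25):
    """Keep the best-scoring crop per image, dropping weak matches.

    scored_crops: iterable of (image_name, crop_id, score) tuples.
    min_score: arbitrary cutoff (assumption); tune it for real data.
    Returns a dict mapping image_name to (crop_id, score).
    """
    best = {}
    for image_name, crop_id, score in scored_crops:
        if score < min_score:
            continue  # too dissimilar to the query to keep
        current = best.get(image_name)
        if current is None or score > current[1]:
            best[image_name] = (crop_id, score)
    return best


# Hypothetical scores for crops taken from three sampled video frames.
frames = [
    ("frame_000.jpg", 0, 0.31),
    ("frame_000.jpg", 1, 0.22),
    ("frame_030.jpg", 0, 0.18),
    ("frame_060.jpg", 2, 0.27),
    ("frame_060.jpg", 3, 0.29),
]

selected = select_crops(frames)
for name, (crop_id, score) in sorted(selected.items()):
    print(name, crop_id, score)
```

Frames with no crop above the cutoff are skipped entirely, which keeps unrelated frames out of the resulting dataset.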

Maintenance & Community

The project acknowledges contributions from Ramsri Goutham Golla and OpenAI. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Object detection is limited to the classes in the COCO dataset, on which YOLOv5 is pre-trained. Objects outside those classes are not detected or cropped, so CLIP never sees them as candidates for a match.

Health Check

Last Commit: 3 years ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

1 star in the last 30 days

Explore Similar Projects

CLIPPyX by 0ssamaak0

AI-powered search tool for content-based image and text similarity

Created 1 year ago

Updated 7 months ago

Starred by Omar Sanseviero (DevRel at Google DeepMind).

segment-anything-with-clip by Curt-Park

Segmentation pipeline combining Segment Anything Model (SAM) with CLIP

Created 2 years ago

Updated 1 year ago

CLIP-ImageSearch-NCNN by EdVince

Image search demo using natural language queries

Created 3 years ago

Updated 2 years ago

krita-vision-tools by Acly

AI-powered image masking and editing tools for Krita

Created 2 years ago

Updated 1 month ago

CLIP-SAM by maxi-w

Open-vocabulary image segmentation via CLIP and SAM

Created 2 years ago

Updated 2 years ago

Starred by Jesse Clark (Cofounder of Marqo), Elie Bursztein (Cybersecurity Lead at Google DeepMind), and 4 more.

coyo-dataset by kakaobrain

Image-text pair dataset for vision-language model training

Created 3 years ago

Updated 3 years ago

tidy by slavabarkov

Android app for offline semantic image search

Created 2 years ago

Updated 1 year ago

rclip by yurijmikhalevich

CLI tool for AI-powered photo search

Created 4 years ago

Updated 2 months ago

OpenAI-CLIP by moein-shariatnia

PyTorch CLIP implementation for text-image retrieval

Created 4 years ago

Updated 2 months ago

clipseg by timojl

Image segmentation via text/image prompts (CVPR 2022 paper)

Created 4 years ago

Updated 2 years ago

Monkey by Yuliang-Liu

Research paper on multimodal models, image resolution, and text labels

Created 2 years ago

Updated 2 months ago

Starred by Max Howell (Author of Homebrew), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 1 more.

big-sleep by lucidrains

CLI tool for text-to-image generation

Created 5 years ago

Updated 3 years ago
