Video search via text queries using CLIP
CLIFS (Contrastive Language-Image Forensic Search) is a proof-of-concept tool for performing free-text searches within video content. It leverages OpenAI's CLIP model to match textual queries with visual frames, enabling users to find specific scenes or objects in videos using natural language. This is particularly useful for forensic analysis or content discovery in large video datasets.
How It Works
CLIFS extracts features from video frames using CLIP's image encoder. Search queries are processed by CLIP's text encoder to generate corresponding features. Similarity matching between frame and query features identifies relevant video segments. Results exceeding a defined similarity threshold are returned. A Django web server provides a user interface for interacting with the search engine.
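A minimal sketch of that pipeline is shown below, assuming OpenAI's reference `clip` package and OpenCV for frame sampling. The model variant (ViT-B/32), sampling interval, and similarity threshold are illustrative assumptions, not CLIFS's actual settings.

```python
# Sketch: sample video frames, encode them with CLIP, and match against a text query.
import cv2
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # model choice is an assumption

def encode_frames(video_path, every_n=30):
    """Sample every Nth frame and return normalized CLIP image features plus timestamps."""
    cap = cv2.VideoCapture(video_path)
    features, timestamps = [], []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            image = preprocess(Image.fromarray(rgb)).unsqueeze(0).to(device)
            with torch.no_grad():
                feat = model.encode_image(image)
            features.append(feat / feat.norm(dim=-1, keepdim=True))
            timestamps.append(cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0)  # approx. time in seconds
        idx += 1
    cap.release()
    return torch.cat(features), timestamps

def search(query, frame_features, timestamps, threshold=0.3):
    """Return (timestamp, score) pairs whose cosine similarity exceeds the threshold."""
    tokens = clip.tokenize([query]).to(device)
    with torch.no_grad():
        text_feat = model.encode_text(tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    scores = (frame_features @ text_feat.T).squeeze(1)  # cosine similarity per frame
    return [(t, s.item()) for t, s in zip(timestamps, scores) if s > threshold]
```

In practice the frame features would be computed once per video and cached, so each query only requires a single text-encoder pass and a matrix multiply over the stored frame features.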
Quick Start & Requirements
Run the setup script:

```sh
./setup.sh
```

followed by building and starting the services:

```sh
docker-compose build && docker-compose up
```

For GPU support, use the GPU compose file instead:

```sh
docker-compose -f docker-compose-gpu.yml up
```

Place the videos to be searched in the data/input directory, then open the search interface at 127.0.0.1:8000.
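Once the containers are running, a quick way to confirm the web interface is reachable (a hedged sketch; only the 127.0.0.1:8000 address comes from the README, and no specific endpoints are documented):

```python
# Smoke test: check that the Django web UI responds at the documented address.
# The root path "/" is an assumption; the README only gives 127.0.0.1:8000.
import requests

resp = requests.get("http://127.0.0.1:8000/", timeout=5)
print(resp.status_code)  # expect 200 once the docker-compose services are up
```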
Highlighted Details
Maintenance & Community
No specific information on contributors, community channels, or roadmap is provided in the README.
Licensing & Compatibility
The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.
Limitations & Caveats
This is described as a proof-of-concept, suggesting potential limitations in robustness and scalability. The README does not detail performance benchmarks or which video formats are supported.