clip-retrieval by rom1504

CLIP retrieval system for semantic search

Created 4 years ago
2,645 stars

Top 17.9% on SourcePulse

Project Summary

clip-retrieval is a toolkit for building scalable CLIP-based retrieval systems: it computes embeddings, builds efficient indices over them, and serves those indices through a web API. It targets researchers and developers who need semantic search over large-scale multimodal datasets.

How It Works

The system uses CLIP models to generate embeddings for text and images. It then builds an Approximate Nearest Neighbor (ANN) index over those embeddings with autofaiss, which allows fast retrieval over millions or billions of items. A Flask-based backend (clip-back) serves the indices through a REST API for querying. Optional features include HDF5/Arrow caching for metadata and memory-mapped indices to reduce RAM usage.
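
To make the retrieval step concrete, here is a minimal sketch of nearest-neighbor search over embeddings. It is not clip-retrieval's code: it swaps the autofaiss ANN index for exact brute-force cosine similarity in pure Python, and the three-dimensional vectors, file names, and the `search` helper are made-up stand-ins for real CLIP embeddings and the real index.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def search(query, items, k=2):
    """Return the k (name, score) pairs most similar to the query.

    This scans every item (exact search). An ANN index trades a little
    accuracy for not having to scan the whole collection.
    """
    q = normalize(query)
    scored = []
    for name, vec in items.items():
        v = normalize(vec)
        scored.append((name, sum(a * b for a, b in zip(q, v))))
    # Highest cosine similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real CLIP embeddings have hundreds of dimensions.
ITEMS = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.6, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

results = search([1.0, 0.0, 0.0], ITEMS, k=2)
print([name for name, _ in results])
```

Because text and image embeddings are compared the same way, the query vector could come from either modality. At the scale the project targets, a full scan like this would be too slow, which is the reason for using an ANN index.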

Quick Start & Requirements

  • Install via pip: pip install clip-retrieval
  • Requires Python 3.7+ and PyTorch. GPU with CUDA is recommended for performance.
  • Official documentation and examples are available on the GitHub repository.

Highlighted Details

  • Processes 100M text/image embeddings in about 20 hours on an NVIDIA RTX 3080 GPU.
  • Achieves about 1500 samples/sec for CLIP inference on the same GPU.
  • Supports various CLIP models, including OpenCLIP and Hugging Face variants.
  • Offers an optional DeepSparse backend for accelerated inference on CPUs.
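
The two throughput figures in this list can be checked against each other with a short calculation. The sketch below assumes a constant inference rate; the function name is illustrative and not part of clip-retrieval.

```python
def hours_to_embed(num_samples, samples_per_sec):
    """Wall-clock hours needed to embed num_samples at a constant throughput."""
    return num_samples / samples_per_sec / 3600


estimate = hours_to_embed(100_000_000, 1500)
print(f"{estimate:.1f} hours")
```

At 1500 samples/sec, 100M samples take roughly 18.5 hours of pure inference, which is consistent with the quoted 20-hour figure. The source does not break down the remaining time.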

Maintenance & Community

  • The project is actively maintained by Romain Beaumont.
  • Related projects include img2dataset, open_clip, and CLIP_benchmark.
  • Community discussion can be found via the DataToML chat.

Licensing & Compatibility

  • The project is released under the MIT License.
  • Permissive licensing allows for commercial use and integration into closed-source projects.

Limitations & Caveats

Performance depends heavily on hardware: a GPU for inference and enough RAM for indexing. Although the system scales to billions of samples, datasets of that size require careful configuration and possibly a distributed computing setup. The clip-front UI is basic and may need customization for production use.
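
A rough storage estimate shows why scale matters here. The sketch below computes the size of raw, uncompressed embeddings under two labeled assumptions that do not come from the source: 512 dimensions per embedding (the output size of CLIP ViT-B/32) and 4-byte float32 values. The function name is illustrative.

```python
def raw_embedding_bytes(num_items, dim=512, bytes_per_value=4):
    """Bytes needed to hold num_items uncompressed embeddings.

    dim=512 and bytes_per_value=4 (float32) are assumptions for illustration.
    """
    return num_items * dim * bytes_per_value


size_gb = raw_embedding_bytes(1_000_000_000) / 1e9
print(f"{size_gb:.0f} GB for one billion embeddings")
```

Under these assumptions, one billion embeddings occupy about 2 TB before any compression, far more than the RAM of a typical single machine. Compressed ANN indices and memory mapping are the usual ways to bring the working set down.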

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 26 stars in the last 30 days
