clip-retrieval by rom1504

CLIP retrieval system for semantic search

created 4 years ago
2,613 stars

Top 18.4% on sourcepulse

Project Summary

This project provides a comprehensive toolkit for building scalable CLIP-based retrieval systems, enabling users to compute embeddings, create efficient indices, and serve them via a web API. It's designed for researchers and developers working with large-scale multimodal datasets who need to implement semantic search capabilities.

How It Works

The system leverages CLIP models to generate embeddings for text and images. It then uses autofaiss for efficient Approximate Nearest Neighbor (ANN) indexing, allowing for fast retrieval over millions or billions of items. A Flask-based backend (clip-back) serves these indices, offering a REST API for querying, with optional features like HDF5/Arrow caching for metadata and memory-mapped indices to reduce RAM usage.
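The retrieval step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not clip-retrieval's own code: the three-dimensional vectors are made-up stand-ins for CLIP embeddings, and the exhaustive scan stands in for the approximate index that autofaiss would build. The names `normalize`, `top_k`, and the item labels are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query, items, k=2):
    """Return the k item names with the highest cosine similarity to the query."""
    q = normalize(query)
    scored = []
    for name, emb in items.items():
        e = normalize(emb)
        scored.append((sum(a * b for a, b in zip(q, e)), name))
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy "embeddings" (assumed values, not real CLIP outputs).
items = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
query_embedding = [1.0, 0.1, 0.0]  # stand-in for an embedded text query

results = top_k(query_embedding, items, k=2)
print(results)
```

An exhaustive scan like this costs one comparison per stored item per query, which is why an approximate index becomes necessary at the scale of millions or billions of items.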

Quick Start & Requirements

  • Install via pip: pip install clip-retrieval
  • Requires Python 3.7+ and PyTorch. A CUDA-capable GPU is recommended for performance.
  • Official documentation and examples are available on the GitHub repository.
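After installation, the workflow of computing embeddings, building an index, and serving it might look like the sketch below. It only assembles and prints the commands rather than running them. The subcommands and flag names follow the project's README and may differ between versions, so check the repository documentation before relying on them. The folder names, the `example_index` key, and the port are placeholder assumptions.

```python
import json
import shlex

# Hypothetical local paths; replace with your own.
image_folder = "image_folder"
embeddings_folder = "embeddings_folder"
index_folder = "index_folder"

# One command per stage: embed, index, serve.
commands = [
    ["clip-retrieval", "inference",
     "--input_dataset", image_folder,
     "--output_folder", embeddings_folder],
    ["clip-retrieval", "index",
     "--embeddings_folder", embeddings_folder,
     "--index_folder", index_folder],
    ["clip-retrieval", "back",
     "--port", "1234",
     "--indices-paths", "indices_paths.json"],
]

# Content for indices_paths.json: maps an index name to its folder.
indices_paths = json.dumps({"example_index": index_folder})
print(indices_paths)

for cmd in commands:
    # shlex.quote keeps the printed command safe to paste into a shell.
    print(" ".join(shlex.quote(part) for part in cmd))
```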

Highlighted Details

  • Processes 100M text/image embeddings in 20 hours on an RTX 3080 GPU.
  • Achieves 1500 samples/sec for CLIP inference on an RTX 3080.
  • Supports various CLIP models, including OpenCLIP and Hugging Face variants.
  • Offers optional DeepSparse backend for CPU-accelerated inference.
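The two throughput figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in this list. It suggests the 20-hour figure corresponds to a sustained rate slightly below the 1500 samples/sec peak.

```python
total_embeddings = 100_000_000   # 100M items, from the list above
hours = 20                       # quoted end-to-end time
peak_rate = 1500                 # quoted samples/sec for CLIP inference

# Average rate implied by "100M in 20 hours".
implied_rate = total_embeddings / (hours * 3600)

# Time the same job would take if the peak rate were sustained throughout.
hours_at_peak = total_embeddings / peak_rate / 3600

print(f"Implied average rate: {implied_rate:.0f} samples/sec")
print(f"Time at peak rate: {hours_at_peak:.1f} hours")
```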

Maintenance & Community

  • The project is actively maintained by Romain Beaumont.
  • Related projects include img2dataset, open_clip, and CLIP_benchmark.
  • Community discussion can be found via the DataToML chat.

Licensing & Compatibility

  • The project is released under the MIT License.
  • Permissive licensing allows for commercial use and integration into closed-source projects.

Limitations & Caveats

Performance depends heavily on hardware, particularly GPUs for inference and sufficient RAM for indexing. While the system scales to billions of samples, datasets of that size require careful configuration and potentially a distributed computing setup. The clip-front UI is basic and may require customization for production use.
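A rough storage estimate shows why RAM and configuration matter at this scale. The sketch below assumes 512-dimensional embeddings (the output size of the CLIP ViT-B/32 model) stored as 16-bit floats. Both values are assumptions for illustration, and `raw_embedding_bytes` is a hypothetical helper, not part of clip-retrieval.

```python
def raw_embedding_bytes(n_items, dim=512, bytes_per_value=2):
    """Bytes needed to hold n_items uncompressed embeddings.

    dim=512 and bytes_per_value=2 (float16) are assumed defaults.
    """
    return n_items * dim * bytes_per_value


for n in (100_000_000, 1_000_000_000):
    gb = raw_embedding_bytes(n) / 1e9  # decimal gigabytes
    print(f"{n:,} items: {gb:,.1f} GB of raw embeddings")
```

A compressed ANN index is typically much smaller than the raw vectors, but the raw embeddings still have to be stored and read while the index is built.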

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0

Star History

  • 80 stars in the last 90 days
