faiss_tips  by matsui528

Faiss tips and tricks

created 7 years ago
622 stars


Project Summary

This repository provides practical code snippets and explanations for Faiss, a library for efficient similarity search and clustering of dense vectors. It is aimed at developers and researchers working with large-scale datasets who need exact nearest neighbor search, approximate nearest neighbor search, or k-means clustering. Its main value is a set of clear, runnable examples covering Faiss functionality and optimization techniques.

How It Works

The repository showcases Faiss's core capabilities through Python code examples. It covers fundamental operations such as setting up IndexFlatL2 for exact nearest neighbor search, moving an index to GPUs with index_cpu_to_all_gpus, and running approximate nearest neighbor search on large datasets by combining IVFPQ with HNSW. It also demonstrates Faiss's optimized k-means implementation, index I/O, and merging of search results.

Quick Start & Requirements

  • Installation: conda install faiss-cpu -c pytorch or conda install faiss-gpu -c pytorch.
  • Prerequisites: Python, NumPy. GPU usage requires CUDA-enabled hardware and drivers.
  • Setup: Minimal setup for basic CPU usage. For GPU usage, the set of GPUs Faiss sees can be restricted with the CUDA_VISIBLE_DEVICES environment variable.
  • Resources: Examples demonstrate usage with datasets up to billion-scale, requiring significant RAM and potentially GPU memory.
  • Links: Faiss GitHub

Highlighted Details

  • Demonstrates GPU acceleration for significant speedups in nearest neighbor search.
  • Provides examples for approximate nearest neighbor search (ANN) using HNSW and IVFPQ, crucial for large-scale data.
  • Includes Faiss's optimized k-means clustering, presented as faster than the k-means in general-purpose libraries such as scikit-learn.
  • Shows how to manage Faiss index persistence via binary files, NumPy arrays, and pickle.

Maintenance & Community

This repository appears to be a personal collection of tips rather than a formally maintained project. The core Faiss library is actively developed by Meta AI.

Licensing & Compatibility

The code snippets are likely under a permissive license, in line with the Faiss library's MIT license. If so, commercial use and integration into closed-source projects are permitted.

Limitations & Caveats

This repository is a collection of tips and examples, not a comprehensive library or framework. Users are expected to have a foundational understanding of Faiss and vector similarity search concepts. Some advanced configurations or edge cases might not be covered.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
