faiss  by facebookresearch

Similarity search library for dense vectors

created 8 years ago
36,392 stars

Top 0.8% on sourcepulse

Project Summary

Faiss is a C++ library with Python bindings for efficient similarity search and clustering of dense vectors. It targets researchers and engineers working with large-scale vector datasets. Its algorithms cover collections of any size, including ones that do not fit in RAM, and its approximate indexes make nearest neighbor search much faster than a brute-force scan at some cost in accuracy.

How It Works

Faiss implements a range of indexing structures and search algorithms, from exact (brute-force) baselines to approximate methods that operate on compressed vector representations. The approximate methods trade search precision for memory efficiency and speed, which is what allows scaling to billions of vectors. Vectors are compared by L2 (Euclidean) distance or dot product; cosine similarity is obtained as a dot product on L2-normalized vectors. The library also offers GPU acceleration via CUDA and ROCm, with substantial performance gains for both exact and approximate search.
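To make the metrics concrete, here is a minimal pure-Python sketch of the exact (brute-force) baseline, and of how cosine ranking reduces to a dot product on normalized vectors. It does not use Faiss itself; the function names and the toy data are illustrative assumptions, not part of the library's API.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def exact_search(database, query, k, metric="l2"):
    """Brute-force baseline: score every database vector, return the top-k ids."""
    if metric == "l2":
        # Smaller distance is better.
        scored = [(l2_squared(vec, query), i) for i, vec in enumerate(database)]
    elif metric == "ip":
        # Larger dot product is better, so sort on the negated score.
        scored = [(-dot(vec, query), i) for i, vec in enumerate(database)]
    else:
        raise ValueError(f"unknown metric: {metric}")
    scored.sort()
    return [i for _, i in scored[:k]]

# Toy data (hypothetical): four 2-D vectors and one query.
database = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [-1.0, 0.5]]
query = [1.0, 1.0]

# Exact L2 search: ids 0 and 1 are closest to the query.
l2_ids = exact_search(database, query, k=2, metric="l2")

# Cosine similarity: normalize both sides, then rank by dot product.
unit_database = [normalize(v) for v in database]
cosine_ids = exact_search(unit_database, normalize(query), k=2, metric="ip")
```

The two rankings differ on purpose: vector 2 points in the same direction as the query, so it wins under cosine similarity even though it is the farthest away in L2 terms.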

Quick Start & Requirements

  • Install: pip install faiss-cpu for the CPU-only build, or pip install faiss-gpu for the GPU build.
  • Prerequisites: Python 3.x, NumPy. GPU version requires CUDA-compatible hardware and drivers.
  • Documentation: Full documentation, tutorials, and FAQs are available on the Faiss wiki.

Highlighted Details

  • Offers a wide range of indexing methods, balancing search time, quality, memory usage, and training time.
  • GPU implementation provides highly optimized nearest neighbor search and k-means clustering.
  • Supports both CPU and GPU memory for input/output, with automatic memory management for GPU operations.
  • Scales to billions of vectors, even on a single server, by using compressed vector representations.
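A rough back-of-the-envelope calculation shows why compressed representations matter at the billion-vector scale. The figures below are assumptions chosen for illustration (128-dimensional float32 vectors and a hypothetical 16-byte code per vector), and the estimate ignores index overhead such as stored ids and auxiliary structures.

```python
def storage_gib(num_vectors, bytes_per_vector):
    """Approximate storage for the vector payload alone, in GiB."""
    return num_vectors * bytes_per_vector / 2**30

num_vectors = 1_000_000_000      # one billion vectors
dim = 128                        # assumed dimension
raw_bytes = dim * 4              # uncompressed float32: 4 bytes per component
code_bytes = 16                  # hypothetical compressed code size per vector

raw_gib = storage_gib(num_vectors, raw_bytes)     # roughly 477 GiB
code_gib = storage_gib(num_vectors, code_bytes)   # roughly 15 GiB
compression_ratio = raw_bytes / code_bytes        # 32x smaller
```

Under these assumptions the uncompressed vectors would not fit in the memory of a typical single server, while the compressed codes would, which is the trade-off the bullets above describe.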

Maintenance & Community

Developed primarily by Meta's Fundamental AI Research group. Community discussions and questions are hosted on GitHub Discussions.

Licensing & Compatibility

MIT License. Permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

Approximate search methods may sacrifice some precision for speed and memory efficiency. The README does not detail specific hardware requirements for optimal GPU performance beyond CUDA compatibility.

Health Check

  • Last commit: 16 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 70
  • Issues (30d): 24

Star History

1,869 stars in the last 90 days
