faiss  by facebookresearch

Similarity search library for dense vectors

Created 8 years ago
37,146 stars

Top 0.9% on SourcePulse

Project Summary

Faiss is a C++ library with Python bindings for efficient similarity search and clustering of dense vectors, aimed at researchers and engineers working with large-scale vector datasets. It includes algorithms that search vector sets of any size, including sets too large to fit in RAM, and its approximate indexes can answer nearest-neighbor queries much faster than a brute-force scan.

How It Works

Faiss implements a range of index structures and search algorithms, from exact brute-force baselines to approximate methods built on compressed vector representations. The approximate methods trade some search accuracy for lower memory use and faster queries, which is what allows scaling to billions of vectors. Vectors are compared by L2 (Euclidean) distance or dot product; cosine similarity is also supported, since it is a dot product on normalized vectors. The library also provides GPU implementations (CUDA and ROCm) of both exact and approximate search, which can be substantially faster than the CPU versions.

Quick Start & Requirements

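  • Install: pip install faiss-cpu, or pip install faiss-gpu for GPU support. The project's own precompiled packages are distributed through conda (for example, conda install -c pytorch faiss-cpu) and the PyPI wheels are community-maintained, so check the installation guide before relying on a particular GPU package.
  • Prerequisites: Python 3.x and NumPy. GPU builds require a supported GPU and matching drivers (CUDA, or ROCm on AMD hardware).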
  • Documentation: Full documentation, tutorials, and FAQs are available on the Faiss wiki.

Highlighted Details

  • Offers a wide range of indexing methods, balancing search time, quality, memory usage, and training time.
  • GPU implementation provides highly optimized nearest neighbor search and k-means clustering.
  • Supports both CPU and GPU memory for input/output, with automatic memory management for GPU operations.
  • Scales to billions of vectors, even on a single server, by using compressed vector representations.

Maintenance & Community

Developed primarily by Meta's Fundamental AI Research group. Community discussions and questions are hosted on GitHub Discussions.

Licensing & Compatibility

MIT License. Permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

Approximate search methods may sacrifice some precision for speed and memory efficiency. The README does not detail specific hardware requirements for optimal GPU performance beyond CUDA compatibility.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull requests (30d): 39
  • Issues (30d): 12
  • Star history: 513 stars in the last 30 days
