ANN benchmarks for approximate nearest neighbor search algorithms, by erikbern
Top 9.1% on SourcePulse
This project provides a comprehensive benchmarking framework for approximate nearest neighbor (ANN) search libraries, targeting researchers and engineers working with high-dimensional data. It offers standardized datasets, Dockerized environments for each algorithm, and tools for reproducible evaluation, enabling objective comparison of ANN library performance.
How It Works
The framework utilizes pre-generated HDF5 datasets with ground truth for top-100 nearest neighbors. Each ANN library is encapsulated within a Docker container, ensuring consistent execution environments. Benchmarking is performed using Python scripts that orchestrate the indexing, querying, and result collection, with a focus on single-CPU saturation and reproducible parameter tuning.
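As an illustration of how results can be scored against ground truth, here is a minimal sketch in plain Python. It is not the project's own code: the helper names brute_force_neighbors and recall_at_k are hypothetical, the data is random toy data rather than one of the HDF5 datasets, and k is 10 rather than the top-100 used by the framework. The "approximate" result is simulated by searching only half of the points.

```python
import heapq
import math
import random


def brute_force_neighbors(train, query, k):
    """Return indices of the k training points closest to query (Euclidean)."""
    return heapq.nsmallest(
        k, range(len(train)), key=lambda i: math.dist(train[i], query)
    )


def recall_at_k(approx_ids, true_ids, k):
    """Fraction of the true top-k neighbors found in the approximate top-k."""
    return len(set(approx_ids[:k]) & set(true_ids[:k])) / k


random.seed(0)
# Toy data (assumption): 200 random 8-dimensional points and one query.
train = [[random.random() for _ in range(8)] for _ in range(200)]
query = [random.random() for _ in range(8)]

# Exact ground truth, analogous to the precomputed neighbors in a dataset.
truth = brute_force_neighbors(train, query, k=10)

# Stand-in for an ANN library: exact search over only the first half of the
# data, so some true neighbors may be missed. Indices still refer to train.
approx = brute_force_neighbors(train[:100], query, k=10)

print(recall_at_k(approx, truth, k=10))
```

Recall computed this way is typically weighed against query throughput, which is the trade-off such benchmarks compare across libraries.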
Quick Start & Requirements
Install with pip install -r requirements.txt followed by python install.py. Running install.py can take 10-30 minutes. Running benchmarks (run.py) can take days.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project focuses on CPU-based ANN algorithms and datasets that fit in RAM; billion-scale benchmarks are handled by a separate project. GPU support for libraries like FAISS requires local compilation and specific flags. The README states that its results are as of April 2025, so they may be superseded by later runs.