vector-db-benchmark by qdrant

Framework for vector search engine benchmarking

Created 3 years ago
339 stars

Top 81.2% on SourcePulse

Project Summary

This project provides a framework for benchmarking various vector search engines, enabling users to compare their performance under identical hardware and scenario constraints. It targets engineers and researchers needing to select the most efficient vector database for their specific use cases, offering objective performance metrics.

How It Works

The framework operates on a server-client model, where each vector database is run as a server via Docker Compose. A separate client instance then executes benchmark scenarios, which can be configured for single or distributed server modes and varying client loads. This approach ensures a consistent testing environment, allowing for direct comparison of engine capabilities.
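The server-client split described above can be modeled roughly in Python. Every name in this sketch is hypothetical and illustrative only, not the framework's actual API; it just captures the idea that a configured client runs an upload phase and then a search phase, under a chosen client load, against a server that Docker Compose brought up separately:

```python
from dataclasses import dataclass


@dataclass
class BenchmarkScenario:
    """Hypothetical model of one benchmark run (names are illustrative)."""
    engine: str            # which server configuration Docker Compose started
    dataset: str           # which dataset the client uploads and queries
    parallel_clients: int  # client load applied during the search phase


def run(scenario: BenchmarkScenario) -> dict:
    # The real client talks to the engine server over the network; this stub
    # only models the two phases every run passes through: upload, then search.
    return {
        "engine": scenario.engine,
        "dataset": scenario.dataset,
        "phases": ["upload", "search"],
        "clients": scenario.parallel_clients,
    }


report = run(
    BenchmarkScenario("qdrant", "dbpedia-openai-100K-1536-angular", parallel_clients=8)
)
```

Because server and client are separate processes, the same client code can be pointed at a single server or a distributed deployment without changes, which is what makes the comparisons apples-to-apples.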

Quick Start & Requirements

  • Install dependencies: pip install poetry then poetry install.
  • Run servers: cd ./engine/servers/<engine-configuration-name> then docker compose up.
  • Run client: poetry shell then python run.py --engines "qdrant-rps-m-*-ef-*" --datasets "dbpedia-openai-100K-1536-angular".
  • Requires Docker, Python 3.x, and Poetry.
  • Official documentation: not explicitly linked; the repository structure implies engine settings live in configuration/ and datasets in datasets/.
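Put together, a full run looks roughly like the transcript below (the placeholder directory and the wildcard engine pattern are taken from the examples above; substitute your own configuration and dataset names):

```shell
# Install the benchmark's dependencies via Poetry.
pip install poetry
poetry install

# In one terminal: start the engine under test (pick a configuration directory).
cd ./engine/servers/<engine-configuration-name>
docker compose up

# In another terminal: run the benchmark client against it.
poetry shell
python run.py \
  --engines "qdrant-rps-m-*-ef-*" \
  --datasets "dbpedia-openai-100K-1536-angular"
```

Running the server and client in separate terminals mirrors the framework's server-client model: the engine stays up while you repeat client runs with different loads.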

Highlighted Details

  • Supports wildcard matching for engines and datasets for flexible testing.
  • Benchmark results are stored locally in the ./results/ directory.
  • Extensible architecture allows for easy integration of new vector databases by implementing base classes.
  • Configuration files allow fine-tuning of connection, collection, upload, and search parameters per engine.
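The wildcard matching mentioned above presumably behaves like shell-style globbing. A quick sketch using Python's standard fnmatch module illustrates the idea; the engine configuration names here are made up in the style of the README's examples, and the framework's actual matching logic may differ:

```python
from fnmatch import fnmatch

# Hypothetical engine configuration names, in the style of the README examples.
engines = [
    "qdrant-rps-m-16-ef-128",
    "qdrant-rps-m-32-ef-256",
    "milvus-default",
]


def select(names: list[str], pattern: str) -> list[str]:
    """Return the names matching a shell-style wildcard pattern."""
    return [n for n in names if fnmatch(n, pattern)]


print(select(engines, "qdrant-rps-*"))
# → ['qdrant-rps-m-16-ef-128', 'qdrant-rps-m-32-ef-256']
```

The same pattern syntax would apply to dataset selection, so one invocation can sweep a whole family of engine configurations against a family of datasets.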

Maintenance & Community

  • Primarily maintained by Qdrant.
  • No explicit links to community channels or roadmaps are provided in the README.

Licensing & Compatibility

  • The README does not specify a license.

Limitations & Caveats

The README does not enumerate the supported engines or datasets beyond its examples, nor does it publish benchmark results or comparisons. The absence of a specified license raises concerns about commercial use and redistribution.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 7
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days
