vector-db-benchmark by qdrant

Framework for vector search engine benchmarking

created 3 years ago
330 stars

Top 84.0% on sourcepulse

Project Summary

This project provides a framework for benchmarking various vector search engines, enabling users to compare their performance under identical hardware and scenario constraints. It targets engineers and researchers needing to select the most efficient vector database for their specific use cases, offering objective performance metrics.

How It Works

The framework operates on a server-client model, where each vector database is run as a server via Docker Compose. A separate client instance then executes benchmark scenarios, which can be configured for single or distributed server modes and varying client loads. This approach ensures a consistent testing environment, allowing for direct comparison of engine capabilities.
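The client-side measurement idea can be illustrated with a minimal sketch. The stubbed `search` function and the metric names below are assumptions for illustration, not the framework's actual API:

```python
import time
import statistics

def search(query):
    """Stand-in for a real engine query; the framework would issue
    an HTTP/gRPC call to the engine server running under Docker Compose."""
    time.sleep(0.001)  # simulate engine-side latency
    return [42]        # dummy result ids

def run_benchmark(queries):
    """Measure per-query latency and overall throughput, as a
    benchmark client would against a fixed server."""
    latencies = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        search(q)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "rps": len(queries) / elapsed,
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": statistics.quantiles(latencies, n=20)[18],
    }

metrics = run_benchmark(queries=[[0.1, 0.2]] * 100)
```

Because the client is the only variable part, swapping the `search` stub for a real engine adapter leaves the measurement loop unchanged, which is what makes the comparisons direct.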

Quick Start & Requirements

  • Install dependencies: pip install poetry then poetry install.
  • Run servers: cd ./engine/servers/<engine-configuration-name> then docker compose up.
  • Run client: poetry shell then python run.py --engines "qdrant-rps-m-*-ef-*" --datasets "dbpedia-openai-100K-1536-angular".
  • Requires Docker, Python 3.x, and Poetry.
  • Official documentation: not explicitly linked, but the repository structure implies engine settings live in configuration/ and dataset definitions in datasets/.
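Engine and dataset selection accepts shell-style wildcards, so a pattern can pick out a whole family of configurations at once. A quick sketch of the matching idea using Python's standard fnmatch (the configuration names below are illustrative, not an exhaustive list):

```python
from fnmatch import fnmatch

# Hypothetical engine configuration names, in the style of the run example
configs = [
    "qdrant-rps-m-16-ef-128",
    "qdrant-rps-m-32-ef-256",
    "milvus-default",
]

pattern = "qdrant-rps-m-*-ef-*"
selected = [name for name in configs if fnmatch(name, pattern)]
# selects only the two qdrant configurations
```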

Highlighted Details

  • Supports wildcard matching for engines and datasets for flexible testing.
  • Benchmark results are stored locally in the ./results/ directory.
  • Extensible architecture allows for easy integration of new vector databases by implementing base classes.
  • Configuration files allow fine-tuning of connection, collection, upload, and search parameters per engine.
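Integrating a new engine amounts to implementing the client-side hooks. A hypothetical sketch of what such an adapter could look like; the class and method names here are assumptions, not the framework's actual base classes:

```python
from abc import ABC, abstractmethod

class BaseClient(ABC):
    """Hypothetical interface a new engine adapter would implement."""

    @abstractmethod
    def connect(self, host: str, port: int) -> None: ...

    @abstractmethod
    def upload(self, vectors: list[list[float]]) -> None: ...

    @abstractmethod
    def search(self, query: list[float], top: int) -> list[int]: ...

class InMemoryClient(BaseClient):
    """Toy adapter: brute-force L2 search over an in-memory list,
    standing in for a real engine's network client."""

    def connect(self, host: str, port: int) -> None:
        self.vectors: list[list[float]] = []

    def upload(self, vectors: list[list[float]]) -> None:
        self.vectors.extend(vectors)

    def search(self, query: list[float], top: int) -> list[int]:
        dist = lambda v: sum((a - b) ** 2 for a, b in zip(query, v))
        order = sorted(range(len(self.vectors)),
                       key=lambda i: dist(self.vectors[i]))
        return order[:top]

client = InMemoryClient()
client.connect("localhost", 6333)
client.upload([[0.0, 0.0], [1.0, 1.0], [0.1, 0.1]])
nearest = client.search([0.0, 0.05], top=2)  # ids of the 2 closest vectors
```

Because the benchmark runner only talks to the adapter through these hooks, the same scenarios run unmodified against any engine that implements them.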

Maintenance & Community

  • Primarily maintained by Qdrant.
  • No explicit links to community channels or roadmaps are provided in the README.

Licensing & Compatibility

  • The README does not specify a license.

Limitations & Caveats

The README presents the benchmarking framework itself but does not enumerate the supported engines or datasets beyond examples, nor does it publish performance results or comparisons. The absence of a specified license raises concerns about commercial use and license compatibility.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 5
  • Issues (30d): 0
  • Star History: 13 stars in the last 90 days
