vector-db-benchmark by qdrant

Framework for vector search engine benchmarking

Created 3 years ago
339 stars

Top 81.2% on SourcePulse

Project Summary

This project provides a framework for benchmarking various vector search engines, enabling users to compare their performance under identical hardware and scenario constraints. It targets engineers and researchers needing to select the most efficient vector database for their specific use cases, offering objective performance metrics.

How It Works

The framework operates on a server-client model, where each vector database is run as a server via Docker Compose. A separate client instance then executes benchmark scenarios, which can be configured for single or distributed server modes and varying client loads. This approach ensures a consistent testing environment, allowing for direct comparison of engine capabilities.
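The server-client split described above can be modeled roughly in Python. Every name in this sketch is hypothetical and illustrative only, not the framework's actual API; it just captures the idea that a configured client runs an upload phase and then a search phase, under a chosen client load, against a server that Docker Compose brought up separately:

```python
from dataclasses import dataclass


@dataclass
class BenchmarkScenario:
    """Hypothetical model of one benchmark run (names are illustrative)."""
    engine: str            # which server configuration Docker Compose started
    dataset: str           # which dataset the client uploads and queries
    parallel_clients: int  # client load applied during the search phase


def run(scenario: BenchmarkScenario) -> dict:
    # The real client talks to the engine server over the network; this stub
    # only models the two phases every run passes through: upload, then search.
    return {
        "engine": scenario.engine,
        "dataset": scenario.dataset,
        "phases": ["upload", "search"],
        "clients": scenario.parallel_clients,
    }


report = run(
    BenchmarkScenario("qdrant", "dbpedia-openai-100K-1536-angular", parallel_clients=8)
)
```

Because server and client are separate processes, the same client code can be pointed at a single server or a distributed deployment without changes, which is what makes the comparisons apples-to-apples.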

Quick Start & Requirements

  • Install dependencies: pip install poetry then poetry install.
  • Run servers: cd ./engine/servers/<engine-configuration-name> then docker compose up.
  • Run client: poetry shell then python run.py --engines "qdrant-rps-m-*-ef-*" --datasets "dbpedia-openai-100K-1536-angular".
  • Requires Docker, Python 3.x, and Poetry.
  • Official documentation: not explicitly linked; the repository structure implies engine settings live in configuration/ and datasets in datasets/.
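Put together, a full run looks roughly like the transcript below (the placeholder directory and the wildcard engine pattern are taken from the examples above; substitute your own configuration and dataset names):

```shell
# Install the benchmark's dependencies via Poetry.
pip install poetry
poetry install

# In one terminal: start the engine under test (pick a configuration directory).
cd ./engine/servers/<engine-configuration-name>
docker compose up

# In another terminal: run the benchmark client against it.
poetry shell
python run.py \
  --engines "qdrant-rps-m-*-ef-*" \
  --datasets "dbpedia-openai-100K-1536-angular"
```

Running the server and client in separate terminals mirrors the framework's server-client model: the engine stays up while you repeat client runs with different loads.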

Highlighted Details

  • Supports wildcard matching for engines and datasets for flexible testing.
  • Benchmark results are stored locally in the ./results/ directory.
  • Extensible architecture allows for easy integration of new vector databases by implementing base classes.
  • Configuration files allow fine-tuning of connection, collection, upload, and search parameters per engine.
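The wildcard matching mentioned above presumably behaves like shell-style globbing. A quick sketch using Python's standard fnmatch module illustrates the idea; the engine configuration names here are made up in the style of the README's examples, and the framework's actual matching logic may differ:

```python
from fnmatch import fnmatch

# Hypothetical engine configuration names, in the style of the README examples.
engines = [
    "qdrant-rps-m-16-ef-128",
    "qdrant-rps-m-32-ef-256",
    "milvus-default",
]


def select(names: list[str], pattern: str) -> list[str]:
    """Return the names matching a shell-style wildcard pattern."""
    return [n for n in names if fnmatch(n, pattern)]


print(select(engines, "qdrant-rps-*"))
# → ['qdrant-rps-m-16-ef-128', 'qdrant-rps-m-32-ef-256']
```

The same pattern syntax would apply to dataset selection, so one invocation can sweep a whole family of engine configurations against a family of datasets.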

Maintenance & Community

  • Primarily maintained by Qdrant.
  • No explicit links to community channels or roadmaps are provided in the README.

Licensing & Compatibility

  • The README does not specify a license.

Limitations & Caveats

The README does not enumerate the supported engines or datasets beyond its examples, nor does it publish benchmark results or comparisons. The absence of a specified license raises concerns about commercial use and redistribution.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 7
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days
