VectorDBBench by zilliztech

Benchmark tool for vector database performance/cost analysis

created 2 years ago
820 stars

Top 44.2% on sourcepulse

Project Summary

VectorDBBench is a Python-based tool for benchmarking the performance and cost-effectiveness of vector databases. It targets users ranging from researchers to engineers who need to compare and select a vector database, and it provides reproducible benchmark results with visualizations.

How It Works

VectorDBBench simulates production workloads by running test cases that cover insertion, searching, and filtered searching. It uses public datasets such as SIFT, GIST, and Cohere, along with OpenAI-generated data, and produces cost-effectiveness reports for cloud services. Each database is supported through a client module, so users can add support for new systems.
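The pluggable-client idea can be illustrated with a minimal sketch. Everything here is hypothetical: `VectorClient`, `InMemoryClient`, `run_case`, and the `min_id` filter are made-up names for illustration, not VectorDBBench's actual API, and the brute-force client only stands in for a real database.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from typing import Optional


class VectorClient(ABC):
    """Hypothetical minimal client interface (not VectorDBBench's real API)."""

    @abstractmethod
    def insert(self, ids: list[int], vectors: list[list[float]]) -> None:
        """Store vectors under the given ids."""

    @abstractmethod
    def search(
        self, query: list[float], k: int, min_id: Optional[int] = None
    ) -> list[int]:
        """Return ids of the k nearest vectors, optionally filtered by id."""


class InMemoryClient(VectorClient):
    """Brute-force stand-in for a real database, for illustration only."""

    def __init__(self) -> None:
        self._rows: dict[int, list[float]] = {}

    def insert(self, ids, vectors):
        for row_id, vector in zip(ids, vectors):
            self._rows[row_id] = vector

    def search(self, query, k, min_id=None):
        # Filtered search: only rows whose id passes the filter are candidates.
        candidates = [
            (math.dist(query, vector), row_id)
            for row_id, vector in self._rows.items()
            if min_id is None or row_id >= min_id
        ]
        return [row_id for _, row_id in sorted(candidates)[:k]]


def run_case(client: VectorClient) -> dict[str, list[int]]:
    """Run insertion, plain search, and filtered search against any client."""
    client.insert([0, 1, 2, 3], [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
    query = [0.1, 0.1]
    return {
        "search": client.search(query, k=2),
        "filtered_search": client.search(query, k=2, min_id=2),
    }


print(run_case(InMemoryClient()))
```

Because `run_case` depends only on the abstract interface, supporting another system in this sketch would mean writing one more subclass while the test case itself stays unchanged.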

Quick Start & Requirements

  • Install: pip install vectordb-bench, or pip install vectordb-bench[all] to include all clients.
  • Requirements: Python >= 3.11.
  • Run: init_bench or vectordbbench [COMMAND] [ARGS]...
  • Documentation: https://zilliz.com/benchmark

Highlighted Details

  • Supports 15 benchmark cases across various dataset sizes and filtering rates.
  • Offers cost-effectiveness reporting for cloud-based vector databases.
  • Includes a leaderboard for comparing performance metrics: QPS (queries per second), QP$ (queries per dollar), and latency.
  • Allows customization of test cases with local datasets.
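As a rough illustration of the leaderboard metrics, the sketch below computes QPS, a latency percentile, and a queries-per-dollar figure from raw measurements. The function names are hypothetical, and the QP$ formula (queries served in one hour divided by an hourly price) is an assumption for illustration; the tool's exact definition may differ.

```python
import math


def qps(num_queries: int, elapsed_seconds: float) -> float:
    """Queries per second over a measured run."""
    return num_queries / elapsed_seconds


def percentile(latencies: list[float], pct: float) -> float:
    """Nearest-rank percentile of a list of latencies (in seconds)."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def queries_per_dollar(qps_value: float, hourly_cost_usd: float) -> float:
    # Assumed definition: queries served in one hour per dollar of hourly cost.
    return qps_value * 3600 / hourly_cost_usd


latencies = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
throughput = qps(num_queries=1000, elapsed_seconds=4.0)
print(throughput)                           # 250.0
print(percentile(latencies, 99))            # 0.1
print(queries_per_dollar(throughput, 0.5))  # 1800000.0
```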

Maintenance & Community

Sponsored by Zilliz, the creators of Milvus. The project is open-source and encourages community contributions.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

According to the README, some systems may fail to complete all tests because of out-of-memory (OOM) errors or timeouts, and these failures are noted in the results. Each test case has a defined timeout to keep run times practical.
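One way a harness could record such failures instead of aborting is sketched below. This is an assumption-laden illustration, not how VectorDBBench implements it: `run_with_timeout` and the three example cases are hypothetical, and a thread-based timeout cannot actually stop the running work, so a real harness would more likely isolate each case in a separate process.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(case, timeout_seconds: float) -> dict:
    """Run a test case and record the outcome rather than raising."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(case)
    try:
        return {"status": "ok", "value": future.result(timeout=timeout_seconds)}
    except FutureTimeout:
        # The worker thread keeps running; only the wait is abandoned.
        return {"status": "timeout", "value": None}
    except MemoryError:
        return {"status": "oom", "value": None}
    finally:
        executor.shutdown(wait=False)


def fast_case() -> int:
    return 42


def slow_case() -> int:
    time.sleep(0.3)  # Simulates a case that exceeds its time budget.
    return 0


def oom_case() -> int:
    raise MemoryError("simulated out-of-memory failure")


for name, case, limit in [
    ("fast", fast_case, 1.0),
    ("slow", slow_case, 0.05),
    ("oom", oom_case, 1.0),
]:
    print(name, run_with_timeout(case, limit)["status"])
```

Recording a status per case lets a results table mark a failed run explicitly rather than leave the entry silently missing.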

Health Check

  • Last commit: 3 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 18
  • Issues (30d): 7

Star History

  • 122 stars in the last 90 days
