VectorDBBench by zilliztech

Benchmark tool for vector database performance/cost analysis

Created 2 years ago

1,024 stars

Top 36.3% on SourcePulse


2 Experts Love This Project

Chaoyu Yang

Founder of Bento

Xiaofan Luan

VP Engineering at Zilliz

Project Summary

VectorDBBench is a Python-based tool for benchmarking the performance and cost-effectiveness of vector databases. It is aimed at researchers and engineers who need to compare and select a vector database, and it provides reproducible benchmark results with visualizations.

How It Works

VectorDBBench simulates production workloads by running test cases that cover insertion, searching, and filtered searching. The cases use public datasets such as SIFT, GIST, Cohere, and OpenAI-generated data. For cloud services, the tool also produces cost-effectiveness reports. Database support is implemented through a client module, so users can add a new system by writing a client for it.
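To make the client-module idea concrete, here is a minimal sketch of what a pluggable client could look like. The names VectorClient, BruteForceClient, insert, search, and min_id are hypothetical and chosen for illustration; they are not taken from VectorDBBench, whose actual interface may differ. The in-memory class stands in for a real database and supports the three operations named above: insertion, searching, and filtered searching.

```python
import math
from abc import ABC, abstractmethod
from typing import Optional


class VectorClient(ABC):
    """Hypothetical client interface (illustrative, not VectorDBBench's API)."""

    @abstractmethod
    def insert(self, ids: list, vectors: list) -> int:
        """Store the vectors under the given ids and return the count inserted."""

    @abstractmethod
    def search(self, query: list, k: int, min_id: Optional[int] = None) -> list:
        """Return ids of the k nearest vectors; if min_id is set, keep only ids >= min_id."""


class BruteForceClient(VectorClient):
    """In-memory stand-in for a real database, using exact Euclidean search."""

    def __init__(self) -> None:
        self._store = {}  # maps id -> vector

    def insert(self, ids, vectors):
        for vec_id, vec in zip(ids, vectors):
            self._store[vec_id] = vec
        return len(ids)

    def search(self, query, k, min_id=None):
        # Apply the (assumed) id-based filter first, then rank by distance.
        candidates = (
            (math.dist(query, vec), vec_id)
            for vec_id, vec in self._store.items()
            if min_id is None or vec_id >= min_id
        )
        return [vec_id for _, vec_id in sorted(candidates)[:k]]
```

Under this design, a benchmark harness would only call insert and search, so adding a new database would mean writing one more subclass.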

Quick Start & Requirements

Install: pip install vectordb-bench, or pip install vectordb-bench[all] to include all clients.
Requirements: Python >= 3.11.
Run: init_bench or vectordbbench [COMMAND] [ARGS]...
Documentation: https://zilliz.com/benchmark

Highlighted Details

Supports 15 benchmark cases across various dataset sizes and filtering rates.
Offers cost-effectiveness reporting for cloud-based vector databases.
Includes a leaderboard for comparing performance metrics: QPS (queries per second), QP$ (queries per dollar), and latency.
Allows customization of test cases with local datasets.
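The leaderboard metrics can be illustrated with a short sketch. The formulas below are assumptions for illustration, not VectorDBBench's documented definitions: QPS is taken as completed queries divided by elapsed seconds, latency is summarized with a nearest-rank percentile, and QP$ is taken as the number of queries one dollar buys at a given hourly price. All function names and parameters are hypothetical.

```python
import math


def qps(num_queries: int, elapsed_seconds: float) -> float:
    """Queries per second over a measured interval."""
    return num_queries / elapsed_seconds


def percentile_latency(latencies: list, pct: float) -> float:
    """Nearest-rank percentile of per-query latencies (e.g. pct=99 for p99)."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def queries_per_dollar(qps_value: float, hourly_cost_usd: float) -> float:
    """Assumed QP$ definition: queries served in one hour divided by the hourly price."""
    return qps_value * 3600 / hourly_cost_usd
```

For example, under these assumptions a service sustaining 250 QPS at 2 USD per hour yields queries_per_dollar(250, 2.0) = 450000.0.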

Maintenance & Community

Sponsored by Zilliz, the creators of Milvus. The project is open-source and encourages community contributions.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

According to the README, some systems may fail to complete every test because of Out of Memory (OOM) errors or timeouts, and such failures are noted in the results. Each test case has a defined timeout so that runs finish in a practical amount of time.
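One way a harness could record such failures instead of aborting is sketched below. This is an assumption-laden illustration, not VectorDBBench's implementation: run_case, slow_case, and the status strings are hypothetical, and the sketch uses a worker thread for brevity. A real harness would more likely run each case in a subprocess, because a thread cannot be killed on timeout and a genuine OOM often terminates the process rather than raising MemoryError.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_case(case_fn, timeout_seconds: float):
    """Run case_fn and return (status, result); status is 'ok', 'timeout', or 'oom'."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(case_fn)
    try:
        return "ok", future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # The case exceeded its time budget; record it rather than failing the run.
        return "timeout", None
    except MemoryError:
        # Raised in the worker and re-raised here by future.result().
        return "oom", None
    finally:
        # Do not block on a case that is still running.
        pool.shutdown(wait=False)


def slow_case():
    """Hypothetical test case that takes longer than its timeout."""
    time.sleep(0.3)
    return "done"
```

Calling run_case(slow_case, 0.05) reports a timeout, and the caller can store that status alongside the results of the cases that did complete.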

Health Check

Last Commit

1 week ago

Responsiveness

1 day

Star History

31 stars in the last 30 days