vsag by antgroup

Vector indexing library for similarity search

created 1 year ago
347 stars

Top 81.1% on sourcepulse

Project Summary

VSAG is a C++ vector indexing library designed for efficient similarity search, particularly for datasets that exceed available memory. It offers a Python wrapper (pyvsag) and aims to simplify parameter tuning for users unfamiliar with the underlying algorithms.

How It Works

VSAG employs a vector indexing algorithm optimized for speed and scalability on large vector sets. The project reports higher queries per second (QPS) than Glass and HNSWLIB at high recall rates, as measured on the GIST dataset.
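
The QPS-at-recall comparison mentioned above can be reproduced in spirit with a small harness. The following is a minimal sketch, not VSAG's benchmark code: it uses only the Python standard library, computes ground truth by brute force, and scores any search function with the signature `search_fn(query, k)`. The names `exact_top_k`, `recall_at_k`, and `measure` are hypothetical helpers introduced here for illustration.

```python
import time


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(base, query, k):
    """Brute-force ground truth: ids of the k base vectors nearest to query."""
    order = sorted(range(len(base)), key=lambda i: squared_l2(base[i], query))
    return order[:k]


def recall_at_k(found_ids, true_ids):
    """Fraction of the true nearest neighbors that were actually returned."""
    return len(set(found_ids) & set(true_ids)) / len(true_ids)


def measure(search_fn, base, queries, k):
    """Return (mean recall@k, queries per second) for search_fn(query, k).

    search_fn is a placeholder for whatever index is under test; only the
    search calls are timed, not the ground-truth computation.
    """
    truth = [exact_top_k(base, q, k) for q in queries]
    start = time.perf_counter()
    results = [search_fn(q, k) for q in queries]
    elapsed = time.perf_counter() - start
    mean_recall = sum(
        recall_at_k(r, t) for r, t in zip(results, truth)
    ) / len(queries)
    qps = len(queries) / elapsed if elapsed > 0 else float("inf")
    return mean_recall, qps
```

To compare two indexes fairly, one would tune each until `mean_recall` reaches the same target (for example 0.90) and then compare the resulting `qps` values on the same hardware.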

Quick Start & Requirements

  • Install: pip install pyvsag
  • C++ Integration: Via CMake (FetchContent).
  • Prerequisites: Python 3.11+ for pyvsag. C++ build tools for native integration.
  • Resources: Performance benchmarks were conducted on an AWS r6i.16xlarge instance.
  • Examples: Python and C++ examples are available in the examples directory.
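
Before running the Python examples, it can help to confirm that the prerequisites listed above are met. The sketch below checks the interpreter version against the stated Python 3.11+ requirement and looks for an installed package. It assumes the import name matches the pip package name `pyvsag`; it does not call any pyvsag API.

```python
import importlib.util
import sys

# Minimum interpreter version, taken from the prerequisites listed above.
MIN_PYTHON = (3, 11)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def pyvsag_installed():
    """Return True if a top-level module named 'pyvsag' can be found.

    Assumption: the import name equals the pip package name.
    """
    return importlib.util.find_spec("pyvsag") is not None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("pyvsag importable:", pyvsag_installed())
```

If the second check prints `False`, running `pip install pyvsag` in the same environment is the install step given above.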

Highlighted Details

  • Reports a QPS improvement of more than 100% over Glass and more than 300% over HNSWLIB on the GIST dataset at 90% recall.
  • Provides parameter generation methods for ease of use.
  • Supports CMake integration for C++ projects.
  • Includes research papers detailing the algorithm and RaBitQ quantization.
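
The percentage figures in the first bullet are easy to misread as multipliers. As a small worked example, assuming "improvement" means a relative increase over the baseline QPS (the usual reading, though the summary does not define it), a gain of more than 100% corresponds to more than 2x the baseline and a gain of more than 300% to more than 4x. The baseline value in the usage lines is hypothetical and is not a measured number.

```python
def speedup_factor(improvement_percent):
    """Convert a relative improvement in percent to a multiplier.

    Assumes improvement = (new - baseline) / baseline * 100.
    """
    return 1.0 + improvement_percent / 100.0


def implied_qps(baseline_qps, improvement_percent):
    """QPS implied by a baseline and a relative improvement."""
    return baseline_qps * speedup_factor(improvement_percent)


if __name__ == "__main__":
    hypothetical_baseline = 1000.0  # illustrative only, not a benchmark result
    print(speedup_factor(100))                      # lower bound vs. Glass
    print(speedup_factor(300))                      # lower bound vs. HNSWLIB
    print(implied_qps(hypothetical_baseline, 300))
```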

Maintenance & Community

  • Developed by Ant Group's Vector Database Team, with community contributions welcomed.
  • Community discussion via Discord.
  • Roadmap includes support for sparse vectors, pluggable quantization, ARM NEON acceleration, GPU acceleration, and optimizer features.

Licensing & Compatibility

  • License: Apache License 2.0.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The library is written primarily in C++ and exposed to Python through a wrapper, so performance through pyvsag may differ from native C++ use. The benchmarks were run on a specific hardware configuration, and results may vary on other machines. The roadmap indicates that several key features are still under development.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 87
  • Issues (30d): 56
  • Star history: 99 stars in the last 90 days
