vectorlite by 1yefuwang1

SQLite extension for fast vector search

created 1 year ago
319 stars

Top 86.1% on sourcepulse

Project Summary

Vectorlite is an in-process vector search extension for SQLite, aimed at developers who need efficient vector search inside their applications without external dependencies. It uses hnswlib for fast Approximate Nearest Neighbor (ANN) search and exposes it through a familiar SQL interface, so it can be used from any programming language that has a SQLite driver.

How It Works

Vectorlite functions as a loadable SQLite extension, introducing new SQL functions and a virtual table for vector storage and querying. It utilizes hnswlib for building and querying ANN indexes, enabling rapid similarity searches. A key advantage is its integration with SQLite's query planner, allowing for predicate pushdown of metadata filters (like rowid ranges) directly into the HNSW index traversal, optimizing queries that combine vector similarity and metadata criteria. It also features a custom, SIMD-accelerated vector distance implementation for enhanced performance.
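To make the benefit of pushing a metadata filter into the search concrete, here is a minimal sketch in plain Python. It uses brute-force search rather than HNSW, and the function names (knn_post_filter, knn_pushdown) are hypothetical; it only illustrates why filtering during the search can return a full set of k matches while filtering afterwards may not. It is not how Vectorlite is implemented.

```python
import random


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_post_filter(vectors, query, k, allowed):
    """Take the global top-k first, then drop rows that fail the filter."""
    top_k = sorted(vectors, key=lambda rid: l2_squared(vectors[rid], query))[:k]
    return [rid for rid in top_k if allowed(rid)]


def knn_pushdown(vectors, query, k, allowed):
    """Apply the filter while scanning, so only allowed rows compete for top-k."""
    candidates = (rid for rid in vectors if allowed(rid))
    return sorted(candidates, key=lambda rid: l2_squared(vectors[rid], query))[:k]


def in_range(rid):
    """Illustrative metadata filter: a rowid range."""
    return 900 <= rid < 1000


random.seed(0)
vectors = {rid: [random.random() for _ in range(8)] for rid in range(1000)}
query = vectors[0]

post = knn_post_filter(vectors, query, 5, in_range)
pushed = knn_pushdown(vectors, query, 5, in_range)
print(f"post-filter returned {len(post)} rows, pushdown returned {len(pushed)} rows")
```

With the filter applied inside the search, the result always contains k rows when at least k rows satisfy the filter. The post-filter variant can return fewer, because most of the global nearest neighbors may fall outside the rowid range.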

Quick Start & Requirements

  • Install: pip install vectorlite-py apsw numpy (for Python users).
  • Prerequisites: SQLite version >= 3.38 is recommended for metadata filtering. apsw is recommended as the SQLite driver for compatibility with newer SQLite features.
  • Usage: Load the extension using conn.load_extension(vectorlite_py.vectorlite_path()) in apsw.
  • Docs: Examples and Python Bindings.
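The steps above might look like the following in Python. This is a sketch under stated assumptions: the table name my_table, the column name my_embedding, the dimension, and the capacity are illustrative, and the virtual table and knn_search syntax follows the pattern shown in the project's README as best recalled here, so check it against the current documentation. The demo is skipped when apsw or vectorlite_py is not installed.

```python
from array import array

DIM = 3            # illustrative vector dimension
MAX_ELEMENTS = 10  # illustrative index capacity

# Table and column names are assumptions made for this example.
CREATE_SQL = (
    "create virtual table my_table using vectorlite("
    f"my_embedding float32[{DIM}], hnsw(max_elements={MAX_ELEMENTS}))"
)
INSERT_SQL = "insert into my_table(rowid, my_embedding) values (?, ?)"
SEARCH_SQL = (
    "select rowid, distance from my_table "
    "where knn_search(my_embedding, knn_param(?, 2))"
)


def as_blob(values):
    """Pack a list of floats as a raw float32 blob (native byte order)."""
    return array("f", values).tobytes()


def run_demo():
    """Return the KNN rows, or None when apsw / vectorlite_py are missing."""
    try:
        import apsw
        import vectorlite_py
    except ImportError:
        return None
    conn = apsw.Connection(":memory:")
    conn.enable_load_extension(True)
    conn.load_extension(vectorlite_py.vectorlite_path())
    cur = conn.cursor()
    cur.execute(CREATE_SQL)
    sample = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
    for rowid, vec in enumerate(sample):
        cur.execute(INSERT_SQL, (rowid, as_blob(vec)))
    return cur.execute(SEARCH_SQL, (as_blob([0.9, 0.0, 0.0]),)).fetchall()


rows = run_demo()
print(rows if rows is not None else "apsw / vectorlite_py not installed; demo skipped")
```

The array module is used here only to keep the sketch free of third-party imports; numpy's float32 arrays with tobytes() produce the same kind of blob.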

Highlighted Details

  • Performance: Claims to be significantly faster than sqlite-vec and sqlite-vss for vector queries, with benchmarks showing 3x-100x speedups depending on dataset size and vector dimension.
  • SIMD Acceleration: Implements a fast, portable SIMD-accelerated vector distance calculation using Google's highway library, outperforming hnswlib's implementation for dimensions >= 256.
  • SQL Interface: Supports standard SQL for creating virtual tables, inserting, updating, and deleting vectors, alongside custom functions like vector_distance, vector_from_json, vector_to_json, knn_param, and knn_search.
  • Index Persistence: Allows saving and reloading HNSW indexes to/from files, and loading existing hnswlib indexes.
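As a rough illustration of what the JSON conversion and distance functions listed above operate on, the sketch below packs float32 vectors into blobs and computes a squared L2 distance in plain Python. The helper names ending in _py are hypothetical stand-ins, not the extension's functions, and the little-endian layout and the choice of squared L2 (which is what hnswlib's L2 space computes) are assumptions about how Vectorlite behaves.

```python
import json
import struct


def vector_from_json_py(text):
    """Parse a JSON array of numbers into a little-endian float32 blob."""
    values = json.loads(text)
    return struct.pack(f"<{len(values)}f", *values)


def vector_to_json_py(blob):
    """Decode a little-endian float32 blob back into a JSON array string."""
    count = len(blob) // 4
    return json.dumps(list(struct.unpack(f"<{count}f", blob)))


def l2_squared_py(blob_a, blob_b):
    """Squared Euclidean distance between two float32 blobs of equal length."""
    count = len(blob_a) // 4
    a = struct.unpack(f"<{count}f", blob_a)
    b = struct.unpack(f"<{count}f", blob_b)
    return sum((x - y) ** 2 for x, y in zip(a, b))


blob = vector_from_json_py("[1.0, 2.5, -3.0]")
print(len(blob), vector_to_json_py(blob))
print(l2_squared_py(vector_from_json_py("[0, 0]"), vector_from_json_py("[3, 4]")))
```

Each float32 component takes 4 bytes, so a blob's length is always four times the vector dimension. A blob of the wrong length for a table's declared dimension cannot be a valid vector for that table.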

Maintenance & Community

The project is actively developed, with recent updates focusing on ARM SIMD support. Community interaction channels are not explicitly mentioned in the README.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Vectorlite is currently in beta, so breaking changes are possible. It supports only float32 vectors, and it restricts how multiple rowid or knn_search constraints can be combined within a single query (OR combinations are supported). Deleting a vector only marks it as deleted and does not free its memory, and the entire vector index is held in memory.
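Because the index lives in memory and deletions do not free space, it can help to estimate a lower bound on memory use before choosing an index capacity. The helper below is a hypothetical back-of-the-envelope calculation: it counts only the raw float32 vector data, and the HNSW graph links and any other bookkeeping come on top of it.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_component=4):
    """Lower bound on memory: raw float32 data only, excluding HNSW graph links."""
    return num_vectors * dim * bytes_per_component


# Example: one million 768-dimensional float32 vectors.
total = raw_vector_bytes(1_000_000, 768)
print(f"{total} bytes, about {total / 2**30:.2f} GiB before graph overhead")
```

Vectors that have been deleted should be counted in num_vectors as well, since their slots are marked as deleted rather than released.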

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 19 stars in the last 90 days
