vectra by Stevenic

Local vector database for Node.js with file-based storage

Created 2 years ago
533 stars

Top 59.4% on SourcePulse

Project Summary

Vectra is a local, file-based vector database for Node.js, designed for scenarios requiring fast similarity search on small, static datasets. It offers a developer-friendly alternative to cloud-hosted solutions like Pinecone or Qdrant for use cases such as few-shot learning examples or single-document question answering.

How It Works

Vectra stores each index as a folder on disk. An index.json file in that folder holds the vectors and any indexed metadata; other metadata is stored separately, keyed by GUID. Queries first filter items by metadata, using a subset of MongoDB query operators, and then rank the matches by similarity. The entire index is loaded into memory, so queries are near-instantaneous (an estimated 1-2ms), but the same design makes Vectra unsuitable for large, dynamic datasets such as chatbot memory.
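
The filter-then-rank query flow can be illustrated with a minimal, self-contained sketch. This is not Vectra's actual code or API: the `Item`, `Condition`, `matches`, and `query` names are hypothetical, only a handful of MongoDB-style operators are shown, and cosine similarity is assumed as the ranking metric.

```typescript
// Illustrative only: hypothetical types and helpers, not Vectra's API.
type Scalar = string | number | boolean;
type Metadata = Record<string, Scalar>;

interface Item {
  id: string;
  vector: number[];
  metadata: Metadata;
}

// A small, assumed subset of MongoDB-style comparison operators.
interface Condition {
  $eq?: Scalar;
  $ne?: Scalar;
  $gt?: number;
  $lt?: number;
  $in?: Scalar[];
}
type Filter = Record<string, Condition>;

// Returns true when every condition in the filter holds for the metadata.
function matches(metadata: Metadata, filter: Filter): boolean {
  return Object.entries(filter).every(([field, cond]) => {
    const value = metadata[field];
    if (cond.$eq !== undefined && value !== cond.$eq) return false;
    if (cond.$ne !== undefined && value === cond.$ne) return false;
    if (cond.$gt !== undefined && !(typeof value === "number" && value > cond.$gt)) return false;
    if (cond.$lt !== undefined && !(typeof value === "number" && value < cond.$lt)) return false;
    if (cond.$in !== undefined && !cond.$in.includes(value)) return false;
    return true;
  });
}

// Cosine similarity between two vectors of equal length (assumed metric).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Step 1: filter by metadata. Step 2: rank the survivors by similarity.
function query(items: Item[], vector: number[], topK: number, filter: Filter = {}) {
  return items
    .filter((item) => matches(item.metadata, filter))
    .map((item) => ({ item, score: cosineSimilarity(item.vector, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Because every item is already in memory, a query in this sketch is a single linear pass plus a sort, which is why the approach works for small datasets and degrades as the item count grows.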

Quick Start & Requirements

  • Install: npm install vectra
  • Requirements: Node.js and a source of embeddings (the example uses an OpenAI API key with text-embedding-ada-002).
  • Setup: Minimal; create an index directory and call createIndex().
  • Docs: https://github.com/Stevenic/vectra

Highlighted Details

  • Local, file-based storage eliminates external dependencies.
  • In-memory indexing for rapid query performance.
  • Supports metadata filtering using MongoDB-like operators.
  • Python bindings (vectra-py) available for cross-language index access.
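
To make the file-based layout more concrete, here is a rough sketch of how an item's metadata might be split between index.json and a separate file. It is an illustration under stated assumptions, not Vectra's implementation: `splitItem`, `IndexEntry`, and the `metadataFile` field are hypothetical names, the caller is assumed to supply the GUID, and the sketch assumes that only non-indexed fields go to the separate file.

```typescript
// Illustrative only: hypothetical names, not Vectra's on-disk format.
type Metadata = Record<string, string | number | boolean>;

interface IndexEntry {
  id: string;
  vector: number[];
  metadata: Metadata;    // only the indexed fields
  metadataFile?: string; // assumed pointer to the separate metadata file
}

interface SplitResult {
  entry: IndexEntry; // what would be written to index.json
  external?: { fileName: string; contents: Metadata }; // what would go in its own file
}

// Splits metadata into indexed fields (kept with the vector) and the rest
// (stored in a separate file named after the caller-supplied GUID).
function splitItem(
  id: string,
  vector: number[],
  metadata: Metadata,
  indexedFields: string[],
  guid: string
): SplitResult {
  const indexed: Metadata = {};
  const rest: Metadata = {};
  for (const [key, value] of Object.entries(metadata)) {
    if (indexedFields.includes(key)) {
      indexed[key] = value;
    } else {
      rest[key] = value;
    }
  }
  if (Object.keys(rest).length === 0) {
    return { entry: { id, vector, metadata: indexed } };
  }
  const fileName = `${guid}.json`;
  return {
    entry: { id, vector, metadata: indexed, metadataFile: fileName },
    external: { fileName, contents: rest },
  };
}
```

Keeping only the indexed fields next to the vectors keeps the in-memory index small, while larger fields such as the original text are read from disk only when needed.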

Maintenance & Community

The project appears to be maintained by Stevenic. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Vectra loads the entire index into memory, which limits its suitability for large or dynamic datasets. Namespaces are not directly supported but can be emulated by creating a separate index for each namespace. Because it is a local, file-based solution, it will not scale the way distributed databases do.
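
One way to emulate namespaces is to give each namespace its own index folder under a shared root. The helper below is a hypothetical sketch of that mapping only; `indexFolderFor` is not part of Vectra, and the naming rule it enforces is an assumption chosen to keep namespaces from escaping the root folder.

```typescript
// Illustrative only: a hypothetical helper, not part of Vectra.
// Maps a namespace name to a dedicated index folder under a shared root.
function indexFolderFor(root: string, namespace: string): string {
  // Assumed rule: allow only simple names so a namespace cannot contain
  // path separators or ".." segments.
  if (!/^[A-Za-z0-9_-]+$/.test(namespace)) {
    throw new Error(`Invalid namespace: ${namespace}`);
  }
  const trimmedRoot = root.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmedRoot}/${namespace}`;
}
```

Each returned folder would then hold one independent index, so a query against one namespace never touches the items of another.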

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 11 stars in the last 30 days
