vectra by Stevenic

Local vector database for Node.js, file-based storage

created 2 years ago
502 stars

Top 62.8% on sourcepulse

Project Summary

Vectra is a local, file-based vector database for Node.js, designed for scenarios requiring fast similarity search on small, static datasets. It offers a developer-friendly alternative to cloud-hosted solutions like Pinecone or Qdrant for use cases such as few-shot learning examples or single-document question answering.

How It Works

Vectra stores each index as a folder on disk. The index.json file holds the vectors and any indexed metadata; other metadata is stored separately, keyed by GUID. Queries first filter items by metadata, using a subset of MongoDB query operators, and then rank the matches by similarity. Because the entire index is loaded into memory, queries are near-instantaneous (an estimated 1-2 ms), but the same design makes Vectra unsuitable for large, dynamic datasets such as chatbot memory.
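The filter-then-rank flow described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not Vectra's implementation: the Item, Filter, matches, and query names are hypothetical, only four operators ($eq, $gt, $lt, $in) are modeled, and cosine similarity is assumed as the ranking measure.

```typescript
// Illustrative sketch only: these types and helpers are hypothetical, not Vectra's API.
type Value = string | number | boolean;
type Metadata = Record<string, Value>;
type Condition = { $eq?: Value; $gt?: number; $lt?: number; $in?: Value[] };
type Filter = Record<string, Condition>;

interface Item {
  id: string;
  vector: number[];
  metadata: Metadata;
}

// Returns true when every field condition in the filter holds for the metadata.
// Simplification: an item that lacks a filtered field never matches.
function matches(metadata: Metadata, filter: Filter): boolean {
  return Object.keys(filter).every((field) => {
    const cond = filter[field];
    const value = metadata[field];
    if (value === undefined) return false;
    if (cond.$eq !== undefined && value !== cond.$eq) return false;
    if (cond.$gt !== undefined && !(typeof value === "number" && value > cond.$gt)) return false;
    if (cond.$lt !== undefined && !(typeof value === "number" && value < cond.$lt)) return false;
    if (cond.$in !== undefined && cond.$in.indexOf(value) === -1) return false;
    return true;
  });
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Step 1: filter by metadata. Step 2: score the survivors. Step 3: keep the top K.
function query(items: Item[], vector: number[], topK: number, filter: Filter = {}) {
  return items
    .filter((item) => matches(item.metadata, filter))
    .map((item) => ({ item, score: cosineSimilarity(item.vector, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// A tiny in-memory "index" with made-up data.
const items: Item[] = [
  { id: "a", vector: [1, 0], metadata: { lang: "en", year: 2023 } },
  { id: "b", vector: [0.9, 0.1], metadata: { lang: "en", year: 2021 } },
  { id: "c", vector: [1, 0], metadata: { lang: "fr", year: 2023 } },
  { id: "d", vector: [0, 1], metadata: { lang: "en", year: 2024 } },
];

const results = query(items, [1, 0], 2, { lang: { $eq: "en" }, year: { $gt: 2022 } });
const frenchOnly = query(items, [1, 0], 5, { lang: { $in: ["fr"] } });
```

Here results contains items "a" and "d" in that order: "b" and "c" are removed by the metadata filter before any similarity is computed, which is why filtering first keeps the ranking step cheap on a small in-memory dataset.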

Quick Start & Requirements

  • Install: npm install vectra
  • Requirements: Node.js, plus an OpenAI API key to generate embeddings in the README example (which uses text-embedding-ada-002).
  • Setup: Minimal; create an index directory and call createIndex().
  • Docs: https://github.com/Stevenic/vectra

Highlighted Details

  • Local, file-based storage eliminates external dependencies.
  • In-memory indexing for rapid query performance.
  • Supports metadata filtering using MongoDB-like operators.
  • Python bindings (vectra-py) available for cross-language index access.

Maintenance & Community

The project appears to be maintained by Stevenic. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Because Vectra loads the entire index into memory, it is a poor fit for large or frequently changing datasets. Namespaces are not directly supported but can be emulated by creating a separate index for each namespace. As a local, file-based solution, it should not be expected to scale like a distributed database.
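To get a feel for the memory constraint, a back-of-the-envelope estimate helps. The sketch below is a rough lower bound under stated assumptions, not a measurement of Vectra itself: it assumes each vector component is held as an 8-byte JavaScript number and ignores metadata, object overhead, and JSON parsing costs. The helper names are hypothetical, and 1536 is the dimensionality of text-embedding-ada-002 embeddings.

```typescript
// Hypothetical helpers: rough lower bound on memory used by the vectors alone.
const BYTES_PER_NUMBER = 8; // JavaScript numbers are IEEE 754 64-bit doubles

// Bytes needed to hold itemCount vectors of the given dimensionality.
function estimateVectorBytes(itemCount: number, dimensions: number): number {
  return itemCount * dimensions * BYTES_PER_NUMBER;
}

// Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes).
function toMiB(bytes: number): number {
  return bytes / (1024 * 1024);
}

// Example: 10,000 items embedded with a 1536-dimensional model.
const vectorBytes = estimateVectorBytes(10000, 1536);
const vectorMiB = toMiB(vectorBytes);
```

For this example, vectorBytes is 122,880,000 and vectorMiB is 117.1875, so ten thousand 1536-dimensional vectors already need well over 100 MiB before any metadata is counted. Real usage will be higher, which is the practical reason the in-memory design suits small, static datasets.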

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 34 stars in the last 90 days
