vectordb  by jina-ai

Python vector database for semantic similarity search

Created 2 years ago
631 stars

Top 52.5% on SourcePulse

Project Summary

vectordb is a Pythonic vector database built for simplicity and scalability. It offers core CRUD operations and can be deployed anywhere from a local process to the cloud. It targets developers who need a lightweight way to store and search vector embeddings, and it builds on DocArray for search logic and Jina for scalable index serving.

How It Works

vectordb uses DocArray as its core engine for vector search, supporting both Approximate Nearest Neighbor (ANN) and Exact Nearest Neighbor (ENN) search. Jina provides the infrastructure for serving the index at scale, with sharding for throughput and replication for availability. As a result, vectordb can run as a standalone library or be served as a scalable service over gRPC, HTTP, or WebSocket.
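To make the exact nearest-neighbor idea concrete, here is a minimal pure-Python sketch that scores every stored vector against a query and keeps the best matches. It is an illustration only, not vectordb's or DocArray's implementation; the function names (`cosine_similarity`, `exact_search`), the choice of cosine similarity, and the toy data are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_search(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    limit: int,
) -> List[Tuple[int, float]]:
    """Score every stored vector against the query (a full scan).

    Returns up to `limit` (index, similarity) pairs, best match first.
    """
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical toy data: three 2-dimensional embeddings.
stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
top = exact_search([1.0, 0.1], stored, limit=2)
print([i for i, _ in top])  # prints [0, 2]
```

Because every vector is compared, the result is always the true top matches, but the cost grows linearly with the size of the index. Approximate methods such as HNSW trade a small amount of recall for much faster queries on large collections.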

Quick Start & Requirements

  • Install: pip install vectordb
  • Prerequisites: Python 3.x, NumPy. HNSWVectorDB requires HNSWLib.
  • Setup: Local setup involves defining a BaseDoc schema with DocArray and initializing a database class (e.g., InMemoryExactNNVectorDB, HNSWVectorDB).
  • Docs: https://docs.jina.ai/concepts/vectordb/

Highlighted Details

  • Offers both Exact NN (InMemoryExactNNVectorDB) and Approximate NN (HNSWVectorDB) search capabilities.
  • Supports serving as a service via gRPC, HTTP, and WebSocket protocols.
  • Provides sharding and replication for scalability and availability.
  • Integrates with Jina AI Cloud for seamless cloud deployment.
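The sharding idea can be sketched in a few lines: documents are split across shards, a query is sent to every shard, and the per-shard results are merged into one global top-k. The code below is a conceptual, single-process sketch under stated assumptions, not how Jina or vectordb route requests; the `Shard` and `ShardedIndex` classes, the round-robin placement, and the dot-product score are hypothetical choices made for illustration.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


class Shard:
    """Holds one partition of the index and answers local top-k queries."""

    def __init__(self) -> None:
        self.docs: Dict[str, Vector] = {}

    def index(self, doc_id: str, vector: Vector) -> None:
        self.docs[doc_id] = vector

    def search(self, query: Vector, limit: int) -> List[Tuple[float, str]]:
        scored = ((dot(query, v), doc_id) for doc_id, v in self.docs.items())
        return heapq.nlargest(limit, scored)


class ShardedIndex:
    """Spreads documents over shards and merges per-shard search results."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [Shard() for _ in range(num_shards)]
        self._next = 0  # round-robin cursor (an assumption of this sketch)

    def index(self, doc_id: str, vector: Vector) -> None:
        # Each document is stored on exactly one shard.
        self.shards[self._next % len(self.shards)].index(doc_id, vector)
        self._next += 1

    def search(self, query: Vector, limit: int) -> List[Tuple[float, str]]:
        # Fan out to every shard, then keep the global top `limit`.
        partial: List[Tuple[float, str]] = []
        for shard in self.shards:
            partial.extend(shard.search(query, limit))
        return heapq.nlargest(limit, partial)


index = ShardedIndex(num_shards=2)
for doc_id, vec in [("a", [1.0, 0.0]), ("b", [0.0, 1.0]),
                    ("c", [0.9, 0.1]), ("d", [0.5, 0.5])]:
    index.index(doc_id, vec)

hits = index.search([1.0, 0.0], limit=2)
print([doc_id for _, doc_id in hits])  # prints ['a', 'c']
```

Each shard only scans its own partition, so shards can work in parallel on separate machines. Replication is the complementary idea: several copies of the same shard hold identical data, and a request can be served by any one of them.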

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

Currently, Jina AI Cloud deployments are limited to 1 replica; support for N replicas in the cloud is under development. The roadmap indicates plans for more ANN algorithms and enhanced filtering capabilities.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
