Python package for local, embeddings-based text retrieval
A minimal Python package for local, end-to-end text retrieval using embeddings and vector search. It is designed for low latency and a small memory footprint, and powers AI features in Kagi Search. It targets developers who need efficient, self-contained semantic search.
How It Works
VectorDB stores text content, automatically chunking long documents. It associates optional metadata with each chunk and uses configurable embedding models (e.g., BAAI, Universal Sentence Encoder, custom HuggingFace models) to generate vector representations. Retrieval is performed via semantic search, returning the most relevant chunks based on query embeddings. For performance, it leverages Faiss for smaller datasets and mrpt for larger ones.
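The sketch below illustrates this flow, assuming the package exposes a Memory class with save and search methods; the constructor options shown (chunking_strategy, embeddings) and their values are illustrative assumptions based on the summary above and should be checked against the README.

```python
# Illustrative sketch of the store-and-retrieve flow; parameter names are
# assumptions and may differ from the actual vectordb2 API.
from vectordb import Memory

memory = Memory(
    chunking_strategy={"mode": "sliding_window", "window_size": 128, "overlap": 16},
    embeddings="BAAI/bge-small-en-v1.5",  # or another HuggingFace model name
)

# Long texts are chunked automatically; optional metadata is attached per chunk.
memory.save(
    ["A long document about apples ...", "A short note about oranges."],
    [{"source": "apples.md"}, {"source": "oranges.md"}],
)

# Semantic search returns the chunks whose embeddings best match the query.
results = memory.search("which fruit is green?", top_n=3)
for hit in results:
    print(hit)
```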
Quick Start & Requirements
pip install vectordb2
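A minimal first run after installing, again assuming the Memory save/search interface summarized above (the PyPI name is vectordb2; the import name shown here is an assumption):

```python
from vectordb import Memory  # import name assumed; package installs as vectordb2

memory = Memory()
memory.save(["The quick brown fox jumps over the lazy dog."])
print(memory.search("fast animal", top_n=1))
```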
Highlighted Details
Notable options include on-disk persistence via a memory file (memory_file) and controlling search result diversity (batch_results
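A sketch of how these options might be used, assuming memory_file is a constructor argument for on-disk persistence and batch_results is a search-time option; both names come from the summary above, while their exact semantics and accepted values are assumptions.

```python
from vectordb import Memory

# Persist the store to disk so it survives restarts (memory_file is assumed
# to name the file used for persistence).
memory = Memory(memory_file="search_index.pkl")
memory.save(["doc one ...", "doc two ..."], [{"id": 1}, {"id": 2}])

# batch_results is assumed to control how results are combined across
# multiple queries; the value below is illustrative.
results = memory.search(
    ["first query", "second query"],
    top_n=2,
    batch_results="diverse",
)
print(results)
```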
).
Maintenance & Community
The project is associated with Kagi Search. Further community or roadmap information is not detailed in the README.
Licensing & Compatibility
Limitations & Caveats
Beyond the Faiss/mrpt switch, the README does not specify maximum data sizes or performance characteristics on very large datasets. The project appears to be actively used within Kagi Search.
Last updated 11 months ago; marked inactive.