vectordb  by epsilla-cloud

Vector database management system

Created 2 years ago
865 stars

Top 41.4% on SourcePulse

Project Summary

EpsillaDB is an open-source vector database management system designed for high-performance, scalable, and cost-effective similarity search for embedding vectors. It targets developers and researchers working with LLMs and AI applications who need efficient retrieval of information based on semantic meaning. EpsillaDB offers a familiar database interface with vector data types, aiming to simplify the integration of vector search into existing systems.

How It Works

EpsillaDB indexes vectors with parallel graph traversal techniques drawn from academic research. The project claims this makes vector search 10x faster than HNSW while maintaining over 99.9% precision. The core is written in C++. Features include metadata filtering, hybrid search over dense and sparse vectors, built-in embedding support for natural language queries, and a cloud-native architecture that separates compute from storage.
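To make the speed and precision claim concrete, the sketch below shows the exact brute-force baseline that any approximate index (HNSW or otherwise) trades against, together with a metadata filter and an overlap measure. It is not EpsillaDB's algorithm. The record layout, the `where` callback, and the function names are hypothetical, and the source does not say how its precision figure is measured; treating it as the share of true top-k neighbours returned is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def exact_top_k(records, query, k, where=None):
    """Brute-force search: score every record that passes the metadata filter."""
    candidates = [r for r in records if where is None or where(r)]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["embedding"], query),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]


def overlap_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that also appear in an approximate result."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical toy data: two-dimensional embeddings with one metadata field.
records = [
    {"id": 1, "lang": "en", "embedding": [1.0, 0.0]},
    {"id": 2, "lang": "en", "embedding": [0.8, 0.6]},
    {"id": 3, "lang": "de", "embedding": [0.9, 0.1]},
    {"id": 4, "lang": "en", "embedding": [0.0, 1.0]},
]
query = [1.0, 0.0]

exact = exact_top_k(records, query, k=2)
filtered = exact_top_k(records, query, k=2, where=lambda r: r["lang"] == "en")
```

An approximate index is typically evaluated by comparing its result ids against `exact` with a measure like `overlap_at_k`. The filtered call shows why metadata filtering changes the answer set: record 3 is the second-nearest vector overall but is excluded once only `lang == "en"` rows are eligible.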

Quick Start & Requirements

  • Install/Run: Docker is the primary method, using docker run --pull=always -d -p 8888:8888 -v /data:/data epsilla/vectordb. The Python client installs with pip install pyepsilla.
  • Prerequisites: Docker, Python 3.x. Building from source requires Ubuntu setup scripts and OAT++ modules.
  • Resources: Runs under Docker or from a local build. Data persists through a volume mount (the -v /data:/data flag in the Docker command).
  • Links: Documentation, Discord, Twitter, Blog, YouTube.
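The Docker command in the list above packs several flags into one line. As a minimal sketch, the helper below rebuilds that command as an argument list so each flag can be annotated and the host-side port or data directory swapped out. The function name and its parameters are hypothetical; the flags, the container port 8888, the container path /data, and the image name are taken from the command as quoted.

```python
import shlex


def docker_run_command(host_port=8888, data_dir="/data", image="epsilla/vectordb"):
    """Build the argv list for the documented docker run invocation."""
    return [
        "docker", "run",
        "--pull=always",             # always fetch the newest image before starting
        "-d",                        # run detached, in the background
        "-p", f"{host_port}:8888",   # host port -> container port 8888
        "-v", f"{data_dir}:/data",   # host directory -> /data in the container
        image,
    ]


# With the defaults this reproduces the command quoted in the list above.
command_line = shlex.join(docker_run_command())
print(command_line)
```

Passing the list to `subprocess.run(docker_run_command(), check=True)` would start the container on a machine with Docker installed. Changing `data_dir` moves where data persists on the host without altering the container-side path.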

Highlighted Details

  • Claims 10x faster vector search than HNSW with >99.9% precision.
  • Supports familiar database concepts (tables, fields) with vectors as a data type.
  • Offers hybrid search, metadata filtering, and built-in embedding capabilities.
  • Provides Python, JavaScript, and Ruby clients, plus a REST API.

Maintenance & Community

The project is actively maintained. Discord, Twitter, and a blog serve as its channels for community engagement and updates.

Licensing & Compatibility

The provided README does not explicitly state a license. Confirm the license terms before commercial use or closed-source linking.

Limitations & Caveats

The README labels the Epsilla Cloud DBaaS as "Experimental". No license is specified, which could be a significant blocker for adoption. Building from source depends on platform-specific setup scripts.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull requests (30d): 3
  • Issues (30d): 0
  • Star history: 3 stars in the last 30 days
