MyScaleDB by myscale

SQL vector database for building scalable AI apps

created 1 year ago
977 stars

Top 38.6% on sourcepulse

Project Summary

MyScaleDB is an open-source SQL vector database built on ClickHouse, designed for developers building AI applications. It offers high-performance vector search, full-text search, and SQL-vector join capabilities, enabling efficient management and querying of massive datasets using familiar SQL syntax.

How It Works

MyScaleDB builds on ClickHouse's columnar OLAP architecture and adds vector indexing algorithms such as SCANN for similarity search. Structured, text, and vector data are managed in one system, so a single SQL query can combine them, for example a vector search restricted by a metadata filter or a hybrid text/vector search. According to the project, this integration delivers millisecond latency on billion-scale vector datasets while inheriting ClickHouse's scalability.
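
To make the idea of a filtered vector search concrete, here is a minimal sketch that composes such a query as a SQL string. It assumes the `distance(column, [...])` function from the MyScale documentation; the table name `default.docs`, the columns `id`, `embedding`, and `category`, and the helper `build_filtered_vector_query` are hypothetical and exist only for illustration. Check the exact syntax against the documentation for your server version.

```python
from typing import Sequence


def build_filtered_vector_query(
    table: str,
    vector_column: str,
    query_vector: Sequence[float],
    where: str,
    top_k: int = 10,
) -> str:
    """Compose a filtered top-k similarity query as a SQL string.

    Assumption: `distance(column, [...])` is the vector distance function
    described in the MyScale docs; `id` is a hypothetical key column.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    # Render the query vector as an array literal, e.g. [0.1, 0.2, 0.3].
    vector_literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    return (
        f"SELECT id, distance({vector_column}, {vector_literal}) AS dist\n"
        f"FROM {table}\n"
        f"WHERE {where}\n"      # metadata filter applied alongside the vector search
        f"ORDER BY dist ASC\n"  # nearest vectors first
        f"LIMIT {top_k}"
    )


# Hypothetical table and filter, for illustration only.
query = build_filtered_vector_query(
    table="default.docs",
    vector_column="embedding",
    query_vector=[0.1, 0.2, 0.3],
    where="category = 'faq'",
    top_k=5,
)
print(query)
```

The helper interpolates its arguments directly into the SQL text, so it is only suitable for trusted inputs; a real application would use the parameter binding of its ClickHouse client instead.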

Quick Start & Requirements

  • Docker: docker run --name myscaledb --net=host myscale/myscaledb:1.8.0
  • Prerequisites: Docker to run the published image. Building from source requires Ubuntu 22.04, LLVM 15.0.7, Rust, Cargo, and Yasm.
  • Resources: At least 32 GB RAM and 16 CPUs are recommended for a Docker Compose deployment.
  • Docs: Vector Search Documentation

Highlighted Details

  • Combines vector search with rich metadata filtering and full-text search for improved RAG accuracy.
  • Offers SQL-compatible interface, eliminating the need for new query languages.
  • Unifies SQL database, vector database, and full-text search engine into a single system.
  • Supports SCANN and HNSW vector index types.

Maintenance & Community

Licensing & Compatibility

  • Licensed under Apache License 2.0.
  • Permissive license suitable for commercial use and integration with closed-source applications.

Limitations & Caveats

The README states that the MSTG algorithm is provided through MyScale Cloud, which suggests it may not be available in the open-source version. Building from source requires a specific Ubuntu release and LLVM version, so the build may be difficult to reproduce on other setups.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 23 stars in the last 90 days
