weaviate by weaviate

Open-source vector database for combining vector search with structured filtering

created 9 years ago
14,056 stars

Top 3.6% on sourcepulse

View on GitHub
Project Summary

Weaviate is an open-source vector database designed for efficient storage and retrieval of object-vector pairs, enabling hybrid search capabilities. It targets software engineers, data engineers, and data scientists building AI-powered applications like chatbots, recommendation systems, and semantic search, offering speed, flexibility, and production-readiness.

How It Works

Weaviate leverages state-of-the-art ML models to convert data (text, images) into searchable vectors. It supports vectorization at import time or allows users to provide their own vectors. Its modular architecture enables integration with numerous third-party services and model hubs (OpenAI, HuggingFace, Cohere) and supports custom modules for tailored functionality. The core engine is optimized for fast nearest neighbor searches, reportedly achieving sub-millisecond response times on millions of objects.
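Weaviate's engine builds an approximate nearest-neighbor index rather than scanning every vector, but the operation it performs is easiest to see in a brute-force sketch. The following toy example (all names and vectors hypothetical, plain Python, not Weaviate's actual algorithm) ranks stored object-vector pairs by cosine similarity to a query vector:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def knn(query, objects, k=10):
    """Brute-force k-NN: rank object-vector pairs by similarity to the query.
    A vector database uses an ANN index instead, trading exactness for speed."""
    scored = [(cosine_similarity(query, vec), obj) for obj, vec in objects]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy "object-vector pairs" standing in for vectorized documents.
docs = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]
results = knn([1.0, 0.0, 0.0], docs, k=2)
print([obj for _, obj in results])  # doc-a is the closest match
```

At scale this linear scan is what the ANN index avoids, which is how millisecond-range response times on millions of objects become feasible.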

Quick Start & Requirements

  • Install: Docker is the recommended method for quick setup.
  • Prerequisites: Docker, Docker Compose.
  • Resources: Minimal resource requirements for local testing; production deployments require careful consideration of scaling needs.
  • Links: Quickstart Tutorial, Documentation, Demos
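For local testing, a Docker Compose file along these lines is typical (the image tag, ports, and environment variables below are illustrative; consult the Quickstart for current values):

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:latest   # pin a specific version in practice
    ports:
      - "8080:8080"    # REST / GraphQL
      - "50051:50051"  # gRPC
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"  # local testing only
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: none  # bring your own vectors
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
```

After `docker compose up -d`, the REST and GraphQL endpoints are reachable on port 8080.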

Highlighted Details

  • 10-NN search on millions of objects in milliseconds.
  • Modules for OpenAI, Cohere, HuggingFace, and custom models.
  • GraphQL and REST APIs, with a new gRPC API for faster access.
  • Client libraries available for Python, JavaScript/TypeScript, Go, and Java.
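The GraphQL API exposes vector search through operators such as `nearVector`. A sketch of a 10-NN query (the `Article` class, its `title` property, and the vector values are hypothetical; the `Get` / `_additional` structure follows Weaviate's GraphQL API):

```graphql
{
  Get {
    Article(
      nearVector: { vector: [0.12, -0.34, 0.56] }
      limit: 10
    ) {
      title
      _additional { distance }
    }
  }
}
```

The same query can be issued through the REST GraphQL endpoint or via any of the client libraries listed above.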

Maintenance & Community

Weaviate is actively developed with a strong community presence. Key community channels include a Community Forum, GitHub, and Slack.

Licensing & Compatibility

Weaviate is licensed under the BSD-3-Clause License. This permissive license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

While Weaviate is described as production-ready, the README does not detail specific hardware requirements for large-scale deployments or provide explicit benchmarks for all supported modules and configurations. Users should consult the documentation for detailed scaling and performance tuning guidance.

Health Check
Last commit

22 hours ago

Responsiveness

Inactive

Pull Requests (30d)
274
Issues (30d)
94
Star History
931 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

0.4%
84k
C/C++ library for local LLM inference
created 2 years ago
updated 12 hours ago