weaviate by weaviate

Open-source vector database for combining vector search with structured filtering

created 9 years ago
14,056 stars

Top 3.6% on sourcepulse

View on GitHub
Project Summary

Weaviate is an open-source vector database designed for efficient storage and retrieval of object-vector pairs, enabling hybrid search capabilities. It targets software engineers, data engineers, and data scientists building AI-powered applications like chatbots, recommendation systems, and semantic search, offering speed, flexibility, and production-readiness.

How It Works

Weaviate leverages state-of-the-art ML models to convert data (text, images) into searchable vectors. It supports vectorization at import time or allows users to provide their own vectors. Its modular architecture enables integration with numerous third-party services and model hubs (OpenAI, HuggingFace, Cohere) and supports custom modules for tailored functionality. The core engine is optimized for fast nearest neighbor searches, reportedly achieving sub-millisecond response times on millions of objects.
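Weaviate's engine builds an approximate nearest-neighbor index rather than scanning every vector, but the operation it performs is easiest to see in a brute-force sketch. The following toy example (all names and vectors hypothetical, plain Python, not Weaviate's actual algorithm) ranks stored object-vector pairs by cosine similarity to a query vector:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def knn(query, objects, k=10):
    """Brute-force k-NN: rank object-vector pairs by similarity to the query.
    A vector database uses an ANN index instead, trading exactness for speed."""
    scored = [(cosine_similarity(query, vec), obj) for obj, vec in objects]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy "object-vector pairs" standing in for vectorized documents.
docs = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]
results = knn([1.0, 0.0, 0.0], docs, k=2)
print([obj for _, obj in results])  # doc-a is the closest match
```

At scale this linear scan is what the ANN index avoids, which is how millisecond-range response times on millions of objects become feasible.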

Quick Start & Requirements

  • Install: Docker is the recommended method for quick setup.
  • Prerequisites: Docker, Docker Compose.
  • Resources: Minimal resource requirements for local testing; production deployments require careful consideration of scaling needs.
  • Links: Quickstart Tutorial, Documentation, Demos
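For local testing, a Docker Compose file along these lines is typical (the image tag, ports, and environment variables below are illustrative; consult the Quickstart for current values):

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:latest   # pin a specific version in practice
    ports:
      - "8080:8080"    # REST / GraphQL
      - "50051:50051"  # gRPC
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"  # local testing only
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: none  # bring your own vectors
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
```

After `docker compose up -d`, the REST and GraphQL endpoints are reachable on port 8080.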

Highlighted Details

  • 10-NN search on millions of objects in milliseconds.
  • Modules for OpenAI, Cohere, HuggingFace, and custom models.
  • GraphQL and REST APIs, with a new gRPC API for faster access.
  • Client libraries available for Python, JavaScript/TypeScript, Go, and Java.
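The GraphQL API exposes vector search through operators such as `nearVector`. A sketch of a 10-NN query (the `Article` class, its `title` property, and the vector values are hypothetical; the `Get` / `_additional` structure follows Weaviate's GraphQL API):

```graphql
{
  Get {
    Article(
      nearVector: { vector: [0.12, -0.34, 0.56] }
      limit: 10
    ) {
      title
      _additional { distance }
    }
  }
}
```

The same query can be issued through the REST GraphQL endpoint or via any of the client libraries listed above.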

Maintenance & Community

Weaviate is actively developed with a strong community presence. Key community channels include a Community Forum, GitHub, and Slack.

Licensing & Compatibility

Weaviate is licensed under the BSD-3-Clause License. This permissive license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

While Weaviate is described as production-ready, the README does not detail specific hardware requirements for large-scale deployments or provide explicit benchmarks for all supported modules and configurations. Users should consult the documentation for detailed scaling and performance tuning guidance.

Health Check
Last commit

22 hours ago

Responsiveness

Inactive

Pull Requests (30d)
274
Issues (30d)
94
Star History
931 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

0.4%
84k
C/C++ library for local LLM inference
created 2 years ago
updated 12 hours ago