chromem-go by philippgille

Embeddable vector database for Go, Chroma-like interface, zero dependencies

created 1 year ago
648 stars

Top 52.4% on sourcepulse

Project Summary

chromem-go is an embeddable vector database for Go applications, designed for simplicity and performance in common use cases such as Retrieval-Augmented Generation (RAG). It offers a Chroma-like interface, zero third-party dependencies, and in-memory operation with optional persistence, so developers can add embedding-based features without running a separate database service.

How It Works

chromem-go operates as an embedded library, similar to SQLite, eliminating the need for a client-server architecture. It stores documents and their corresponding embeddings, supporting various embedding providers (OpenAI, Ollama, etc.) and custom implementations. Queries perform exhaustive nearest neighbor search using cosine similarity, with options for metadata and document content filtering.
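To make the query step concrete, here is a minimal standard-library sketch of exhaustive nearest neighbor search with cosine similarity and an exact-match metadata filter. It is not chromem-go's implementation or API: the `Doc` and `Result` types and the `cosine`, `matches`, and `search` functions are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Doc is a hypothetical stored document with metadata and an embedding.
type Doc struct {
	ID        string
	Metadata  map[string]string
	Embedding []float32
}

// Result pairs a document ID with its similarity to the query vector.
type Result struct {
	ID         string
	Similarity float32
}

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ or when either vector has zero magnitude.
func cosine(a, b []float32) float32 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return float32(dot / (math.Sqrt(na) * math.Sqrt(nb)))
}

// matches reports whether meta contains every key/value pair in where.
// A nil or empty filter matches everything.
func matches(meta, where map[string]string) bool {
	for k, v := range where {
		if meta[k] != v {
			return false
		}
	}
	return true
}

// search compares the query against every document that passes the filter
// (exhaustive, no index) and returns the n most similar results.
func search(docs []Doc, query []float32, n int, where map[string]string) []Result {
	if n < 0 {
		n = 0
	}
	results := make([]Result, 0, len(docs))
	for _, d := range docs {
		if !matches(d.Metadata, where) {
			continue
		}
		results = append(results, Result{ID: d.ID, Similarity: cosine(d.Embedding, query)})
	}
	sort.Slice(results, func(i, j int) bool {
		return results[i].Similarity > results[j].Similarity
	})
	if n < len(results) {
		results = results[:n]
	}
	return results
}

func main() {
	docs := []Doc{
		{ID: "a", Metadata: map[string]string{"lang": "en"}, Embedding: []float32{1, 0}},
		{ID: "b", Metadata: map[string]string{"lang": "de"}, Embedding: []float32{0.9, 0.1}},
		{ID: "c", Metadata: map[string]string{"lang": "en"}, Embedding: []float32{0, 1}},
	}
	for _, r := range search(docs, []float32{1, 0}, 2, map[string]string{"lang": "en"}) {
		fmt.Printf("%s %.3f\n", r.ID, r.Similarity)
	}
}
```

Because every stored vector is compared against the query, the cost grows linearly with the number of documents. That is simple and exact, and is the trade-off an approximate index would relax.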

Quick Start & Requirements

  • Install: go get github.com/philippgille/chromem-go@latest
  • Prerequisites: OpenAI API key (if using default embedding provider), Go environment.
  • Example: The project README includes a minimal Go snippet for a RAG setup. Official examples for RAG and semantic search are also available.

Highlighted Details

  • Zero third-party dependencies.
  • Embeddable, no separate DB required.
  • Multithreaded processing when adding and querying documents, using Go's native concurrency features.
  • Supports multiple embedding providers (hosted and local) and custom functions.
  • Exhaustive nearest neighbor search with cosine similarity.
  • In-memory storage with optional gob-encoded persistence and AES-GCM encrypted backups.
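The persistence bullet combines two standard-library techniques: gob encoding and AES-GCM authenticated encryption. The sketch below shows how they can be chained for an encrypted backup. It is an illustration under stated assumptions, not chromem-go's actual file format or API: the `Doc` type and the `encryptDocs` and `decryptDocs` functions are hypothetical, and prepending the nonce to the ciphertext is one common convention assumed here.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/gob"
	"errors"
	"fmt"
)

// Doc is a hypothetical document record to be backed up.
type Doc struct {
	ID        string
	Content   string
	Embedding []float32
}

// newGCM builds an AES-GCM cipher from a 16-, 24-, or 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptDocs gob-encodes docs, then seals the bytes with AES-GCM.
// The random nonce is prepended to the returned ciphertext (an assumed layout).
func encryptDocs(docs []Doc, key []byte) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(docs); err != nil {
		return nil, err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, buf.Bytes(), nil), nil
}

// decryptDocs reverses encryptDocs. It fails if the key is wrong or the
// data was modified, because GCM authenticates the ciphertext.
func decryptDocs(data, key []byte) ([]Doc, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(data) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return nil, err
	}
	var docs []Doc
	if err := gob.NewDecoder(bytes.NewReader(plain)).Decode(&docs); err != nil {
		return nil, err
	}
	return docs, nil
}

func main() {
	key := make([]byte, 32) // 32 bytes selects AES-256
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	in := []Doc{{ID: "1", Content: "hello", Embedding: []float32{0.1, 0.2}}}
	blob, err := encryptDocs(in, key)
	if err != nil {
		panic(err)
	}
	out, err := decryptDocs(blob, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out), out[0].ID, out[0].Content)
}
```

In a real backup the resulting bytes would be written to a file or other writer; the sketch keeps everything in memory so the round trip is easy to follow.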

Maintenance & Community

The project is under active development (beta, pre-v1.0.0) with a changelog documenting all changes.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The project is in beta and may introduce breaking changes before version 1.0.0. Similarity search is currently limited to exhaustive (brute-force) search; Approximate Nearest Neighbor (ANN) search (HNSW, IVFFlat) is on the roadmap.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 5
  • Issues (30d): 3
  • Star History: 121 stars in the last 90 days

Explore Similar Projects

Starred by Jared Palmer (Ex-VP of AI at Vercel; Founder of Turborepo; Author of Formik, TSDX).

pgvector-node by pgvector

0.8%
399
Node.js library for pgvector support
created 4 years ago
updated 2 weeks ago