haiku.rag  by ggozad

RAG on SQLite without external vector databases

created 9 months ago
280 stars

Top 93.9% on sourcepulse

Project Summary

This library provides Retrieval-Augmented Generation (RAG) using only SQLite, with no separate vector database. It targets developers and researchers building AI applications that need efficient, local document querying and question answering in a self-contained package.

How It Works

The library uses sqlite-vec to store and query vector embeddings directly within SQLite. It supports hybrid search, combining vector similarity with full-text search and merging the two result lists with Reciprocal Rank Fusion (RRF). Keeping everything in SQLite simplifies deployment and reduces infrastructure overhead compared with solutions that require an external vector database.
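
To illustrate the ranking step, here is a minimal sketch of Reciprocal Rank Fusion in Python. It is not haiku.rag's actual implementation: the function name reciprocal_rank_fusion, the document IDs, and the constant k=60 (a commonly used default) are assumptions made for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Scores are summed across lists.
    The value of k is an assumption; 60 is a common default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
vector_hits = ["doc_b", "doc_a", "doc_c"]    # from vector similarity
fulltext_hits = ["doc_a", "doc_d", "doc_b"]  # from full-text search

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print(fused)
```

Documents that rank well in both lists (here doc_a and doc_b) rise above documents that appear in only one. Because RRF uses only rank positions, it needs no normalization between vector distances and full-text relevance scores.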

Quick Start & Requirements

  • Install via pip: pip install haiku.rag
  • Usage: haiku-rag add "Your content here", haiku-rag search "query", haiku-rag ask "question"
  • Supports multiple embedding providers (Ollama, OpenAI, VoyageAI) and QA providers (Ollama, OpenAI, Anthropic).
  • Can be run as a server with file monitoring.
  • Full documentation available at: https://ggozad.github.io/haiku.rag/

Highlighted Details

  • Supports over 40 file formats, including PDF, DOCX, audio, and URLs.
  • Offers a CLI and Python API for integration.
  • Can be exposed as tools for AI assistants (e.g., Claude Desktop) via an MCP server.
  • Implements hybrid search combining vector and full-text search with RRF.

Maintenance & Community

The project is maintained by ggozad. Community and support channels are not explicitly mentioned in the README.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The README does not mention any specific limitations, known bugs, or deprecation warnings. The absence of a specified license may pose a caveat for commercial adoption.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 11
  • Issues (30d): 4
  • Star history: 275 stars in the last 90 days
