ai-localbase by veyliss

Local-first RAG system for document Q&A and LLM integration

Created 1 month ago
287 stars

Top 91.3% on SourcePulse

View on GitHub
Project Summary

A local-first, self-hostable Retrieval Augmented Generation (RAG) system designed to integrate local documents with large language models for conversational search. It targets individuals and small teams seeking a private, customizable AI knowledge base, offering support for various document types and flexible model deployment via Ollama or OpenAI-compatible APIs. The system provides a web UI for knowledge base management, document ingestion, and persistent chat history, enabling rapid prototyping and local AI application development.

How It Works

AI LocalBase employs a RAG architecture, leveraging Qdrant as its vector database. Documents (TXT, Markdown, PDF, XLSX, CSV) are processed through text splitting, embedding, and indexing. User queries trigger vector searches in Qdrant, with relevant context dynamically injected into prompts sent to LLMs accessed via Ollama or OpenAI-compatible endpoints. The system features local persistence for chat history (SQLite) and configuration (JSON), alongside advanced retrieval strategies like MMR de-duplication and optional hybrid search for enhanced accuracy.
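The MMR de-duplication step mentioned above can be sketched as follows. This is a generic illustration of Maximal Marginal Relevance, not the project's actual Go implementation; the function names and the lambda_ weighting are assumptions:

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query_vec, candidates, k=3, lambda_=0.7):
    """Maximal Marginal Relevance: greedily pick chunks that are relevant
    to the query but dissimilar to chunks already selected, which
    de-duplicates near-identical passages in the retrieved context."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best, best_score = None, -float("inf")
        for i in remaining:
            relevance = cosine(query_vec, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            # High lambda_ favors relevance; low lambda_ favors diversity.
            score = lambda_ * relevance - (1 - lambda_) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a low lambda_, an exact-duplicate candidate is skipped in favor of a less similar but non-redundant one, which is the behavior the summary describes.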

Quick Start & Requirements

The quickest way to get started is via Docker Compose, using docker compose up --build. For local development, prerequisites include Docker, Go, and Node.js. Users will need to configure chat and embedding models, with examples provided for Ollama and OpenAI-compatible APIs. Detailed setup, environment variables, and API explanations are available in docs/getting-started.md.
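To illustrate what "Ollama and OpenAI-compatible" backends mean in practice, the sketch below builds the chat request bodies each style expects. The helper functions and the model name are placeholders for illustration, not part of ai-localbase; consult docs/getting-started.md for the project's actual configuration keys:

```python
import json

def ollama_chat_request(model, messages):
    # Body for Ollama's native POST /api/chat endpoint.
    return {"model": model, "messages": messages, "stream": False}

def openai_chat_request(model, messages):
    # Body for an OpenAI-compatible POST /v1/chat/completions endpoint.
    return {"model": model, "messages": messages}

# "llama3" is an example model name, not a project default.
body = ollama_chat_request(
    "llama3",
    [{"role": "user", "content": "Summarize my notes on Qdrant."}],
)
print(json.dumps(body, indent=2))
```

Both styles take the same role/content message list, which is why a single RAG pipeline can target either backend by swapping the endpoint URL and model name.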

Highlighted Details

  • Supports ingestion and querying of TXT, Markdown, PDF (text extraction), XLSX, and CSV file formats.
  • Features a built-in MCP (Model Context Protocol) server, enabling external agent and tool systems to access its knowledge base, conversation, and retrieval services via HTTP/JSON-RPC.
  • Implements advanced retrieval techniques including text splitting, batch embedding, dynamic candidate recall, keyword coverage re-ranking, MMR de-duplication, embedding caching, and optional semantic caching.
  • Offers flexible model integration supporting both Ollama and any OpenAI-compatible API endpoints for chat and embedding models.
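The MCP server's HTTP/JSON-RPC transport can be illustrated with a minimal request envelope. The JSON-RPC 2.0 framing below follows the MCP convention, but the tool name and arguments are hypothetical, not the project's actual API:

```python
import itertools
import json

_ids = itertools.count(1)

def jsonrpc_request(method, params):
    """Build a JSON-RPC 2.0 request envelope, the framing MCP uses
    for client-to-server calls."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

# "tools/call" is the standard MCP method for invoking a server tool;
# the tool name "search_knowledge_base" and its arguments are hypothetical.
req = jsonrpc_request("tools/call", {
    "name": "search_knowledge_base",
    "arguments": {"query": "vector database", "top_k": 5},
})
print(json.dumps(req))
```

An external agent would POST such envelopes to the MCP server's HTTP endpoint and read JSON-RPC responses carrying the retrieved chunks.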

Maintenance & Community

The project includes contribution (CONTRIBUTING.md) and security (SECURITY.md) guidelines, indicating a structured approach to development. Specific details on active contributors, community channels (like Discord/Slack), or sponsorship are not explicitly detailed in the provided README.

Licensing & Compatibility

The project is open-source, with a LICENSE file present. Specific license type and compatibility notes for commercial use or closed-source linking are not detailed in the provided text. It is designed for local and self-hosted environments.

Limitations & Caveats

PDF support is limited to text extraction, so scanned or image-only PDFs would likely require external OCR before ingestion. The README covers core functionality and setup; deeper details on retrieval optimizations, MCP capabilities, and deployment variations live in separate documentation files.

Health Check

  • Last Commit: 13 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 163 stars in the last 30 days
