2dogsandanerd
RAG system for private code and document querying
Top 99.3% on SourcePulse
This project provides a production-ready, self-hosted Retrieval-Augmented Generation (RAG) system designed for ingesting and querying codebases and documentation with a focus on privacy and zero configuration. It targets developers and power users needing a private, local LLM-integrated knowledge base solution, offering a robust alternative to cloud-based services.
How It Works
The system leverages a Docker-powered architecture, combining ChromaDB for vector storage and Docling for document ingestion. It employs a hybrid chunking strategy (vector + BM25) to process diverse data types, including PDFs and code repositories, enabling it to differentiate between code and prose. A FastAPI backend exposes CRUD, ingestion, and search APIs, while a modern UI provides dashboards, ingestion tools, and agent configuration capabilities.
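The "vector + BM25" combination mentioned above can be illustrated with a small standard-library sketch. It is not taken from the project: the fusion method (reciprocal rank fusion), the function names, and the hand-written two-dimensional "embeddings" are all assumptions made for illustration. In the system described here, the vectors would presumably come from an embedding model and be stored in ChromaDB.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer (illustrative only)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document against the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking adds 1 / (k + rank) per document."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document index.
    return [d for d, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


def hybrid_search(query, query_vec, docs, doc_vecs):
    """Rank documents by fusing a BM25 ranking with a cosine-similarity ranking."""
    lexical = bm25_scores(query, docs)
    dense = [cosine(query_vec, v) for v in doc_vecs]
    by_lexical = sorted(range(len(docs)), key=lambda i: -lexical[i])
    by_dense = sorted(range(len(docs)), key=lambda i: -dense[i])
    return rrf([by_lexical, by_dense])


# Toy corpus with hand-written 2-d vectors standing in for real embeddings.
docs = [
    "def load_settings path returns parsed values",
    "the settings guide explains each env variable",
    "chroma stores embeddings for similarity search",
]
doc_vecs = [[0.7, 0.3], [1.0, 0.0], [0.0, 1.0]]
ranking = hybrid_search("env settings", [1.0, 0.0], docs, doc_vecs)
print(ranking)
```

Rank fusion is used here because it sidesteps the problem of putting BM25 scores and cosine similarities on a common scale; a weighted sum of normalized scores would be another reasonable choice.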
Quick Start & Requirements
Copy .env.example to .env and configure settings (e.g., DOCS_DIR, LLM provider), then run docker compose up -d.
http://localhost:8080/
http://localhost:8080/docs
http://localhost:8080/health
Highlighted Details
.env files or using the CLI, including connectivity testing.
OPENAI_BASE_URL, offering flexibility for local model deployment.
Maintenance & Community
The author has expressed significant distress over alleged plagiarism, has stated they are "out of this game", and says they will no longer contribute to open source. This makes it likely that maintenance will stop. No community links (Discord, Slack) are provided.
Licensing & Compatibility
The license type is not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is therefore undetermined.
Limitations & Caveats
The project's future maintenance is highly uncertain due to the author's stated withdrawal from open-source contributions following a dispute. The lack of explicit licensing information poses a significant adoption blocker for commercial or sensitive use cases.
2 months ago
Inactive