mcp-local-rag by shinpr

Local RAG for private code and document search

Created 8 months ago

337 stars

Top 81.3% on SourcePulse

Summary

This project is a local-first Retrieval-Augmented Generation (RAG) server for developers. It provides semantic and keyword search across local code and technical documentation, is fully private, needs no setup beyond npx, and operates offline after an initial model download. AI assistants get accurate, context-aware access to sensitive local data with no external services or usage costs.

How It Works

The system chunks documents based on semantic similarity rather than fixed character counts, preserving the integrity of code blocks and natural topic boundaries. Text is converted into vector embeddings locally using Transformers.js. Search queries leverage a hybrid approach, combining semantic vector search with a keyword boost that prioritizes exact technical terms like function names or error codes. Results are then filtered by relevance gaps, yielding fewer but more reliable information chunks.
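The ranking and filtering steps described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the `Chunk` and `Scored` types, the function names, the additive `boost` weight, and the `maxGap` threshold are all assumptions chosen for the example, and the embeddings are toy vectors rather than real model output.

```typescript
// Hypothetical types and names for illustration; not the project's API.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

interface Scored {
  chunk: Chunk;
  score: number;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Fraction of query terms that appear verbatim (case-insensitive) in the text.
function keywordOverlap(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((term) => haystack.includes(term)).length;
  return hits / terms.length;
}

// Semantic score plus an additive boost for exact term matches (assumed formula).
function hybridSearch(
  query: string,
  queryEmbedding: number[],
  chunks: Chunk[],
  boost = 0.3,
): Scored[] {
  return chunks
    .map((chunk) => ({
      chunk,
      score:
        cosine(queryEmbedding, chunk.embedding) +
        boost * keywordOverlap(query, chunk.text),
    }))
    .sort((a, b) => b.score - a.score);
}

// Keep results up to the first large score drop between neighbouring results.
function filterByRelevanceGap(ranked: Scored[], maxGap = 0.2): Scored[] {
  if (ranked.length === 0) return [];
  const kept: Scored[] = [ranked[0]];
  for (let i = 1; i < ranked.length; i++) {
    if (ranked[i - 1].score - ranked[i].score > maxGap) break;
    kept.push(ranked[i]);
  }
  return kept;
}
```

In this sketch a chunk that contains the literal query term (for example `useEffect`) can outrank a chunk that is slightly closer in embedding space, and the gap filter drops everything after the first sharp fall in score instead of returning a fixed top-k.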

Quick Start & Requirements

Primary Install/Run: Run npx -y mcp-local-rag to start the server; CLI commands are also available.
Prerequisites: Node.js environment (implied by npx). The initial embedding model download (~90MB) requires an internet connection and takes 1-2 minutes; subsequent operation is fully offline.
Setup: Zero-friction setup with no Docker, Python, or server management required.
Configuration: Integrates with AI coding tools like Cursor, Codex, and Claude Code via specific configuration files or commands. Manual model download is available at https://huggingface.co/Xenova/all-MiniLM-L6-v2.
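
As a rough illustration of the configuration step, the snippet below builds a server entry in the `mcpServers` layout that several MCP clients read and prints it as JSON. The entry name `local-rag` is a hypothetical label, and the file location and any additional settings (such as environment variables) are assumptions that vary by client; the project's README remains the reference for the exact configuration.

```typescript
// Assumed shape: the "mcpServers" layout used by several MCP clients.
// The key "local-rag" is a hypothetical label, not a required name.
const mcpConfig = {
  mcpServers: {
    "local-rag": {
      command: "npx",
      args: ["-y", "mcp-local-rag"],
    },
  },
};

// Print JSON that could be adapted for a client's MCP configuration file.
console.log(JSON.stringify(mcpConfig, null, 2));
```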

Highlighted Details

Local-first RAG server with hybrid semantic and keyword search.
"Zero-friction setup" via npx, eliminating complex dependencies.
Fully private and offline operation post-model download.
Intelligent semantic chunking preserves document meaning and code integrity.
Keyword boost enhances ranking for exact technical terms (e.g., useEffect, class names).
Quality-first result filtering by relevance gaps.
Optional Agent Skills improve AI assistant query formulation and result interpretation.
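
To make the semantic chunking idea concrete, here is a minimal sketch of one common approach: start a new chunk whenever the similarity between adjacent sentence embeddings falls below a threshold. The summary does not describe how the project detects boundaries, so the `EmbeddedSentence` type, the `chunkBySimilarity` function, and the `threshold` value are assumptions, and the sketch does not attempt the code-block preservation mentioned above.

```typescript
// Hypothetical input type: a sentence with a precomputed embedding.
interface EmbeddedSentence {
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Group consecutive sentences; split where adjacent similarity drops below the threshold.
function chunkBySimilarity(
  sentences: EmbeddedSentence[],
  threshold = 0.5,
): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  for (let i = 0; i < sentences.length; i++) {
    const topicShift =
      i > 0 &&
      cosineSimilarity(sentences[i - 1].embedding, sentences[i].embedding) <
        threshold;
    if (topicShift) {
      chunks.push(current.join(" "));
      current = [];
    }
    current.push(sentences[i].text);
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Unlike fixed character counts, the chunk sizes here follow the content: a long run of related sentences stays together, and a topic change produces a boundary regardless of length.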

Maintenance & Community

The README gives no details on maintainers, community channels (e.g., Discord, Slack), or sponsorships. The source is hosted on GitHub: https://github.com/shinpr/mcp-local-rag.git.

Licensing & Compatibility

License: MIT License.
Compatibility: Permissive for personal and commercial use, allowing integration into closed-source applications.

Limitations & Caveats

Embedding generation currently relies on CPU via Transformers.js; GPU acceleration is experimental.
The system is designed for single-user, local access and lacks multi-user support or built-in authentication.
Supported document formats include PDF, DOCX, TXT, Markdown, and HTML; Excel, PowerPoint, and image formats are not yet supported.
Switching embedding models necessitates deleting and re-ingesting the entire vector database due to incompatible vector dimensions.
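
The model-switching caveat comes down to a dimension mismatch: vectors stored by one model cannot be compared with query vectors from a model that outputs a different length. The sketch below shows a guard of that kind. The `StoreMeta` type and `assertSameDimension` function are hypothetical, not part of the project; the 384 figure is the output size of all-MiniLM-L6-v2, and 768 stands in for a larger model.

```typescript
// Hypothetical metadata recorded when the vector store was first built.
interface StoreMeta {
  modelName: string;
  dimension: number;
}

// Throw if a new embedding cannot be compared with the stored vectors.
function assertSameDimension(meta: StoreMeta, embedding: number[]): void {
  if (embedding.length !== meta.dimension) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, but the store was built ` +
        `with ${meta.modelName} (${meta.dimension}). Delete and re-ingest.`,
    );
  }
}

// all-MiniLM-L6-v2 produces 384-dimensional vectors.
const storeMeta: StoreMeta = { modelName: "all-MiniLM-L6-v2", dimension: 384 };

// A vector of the stored size passes the check.
assertSameDimension(storeMeta, new Array<number>(384).fill(0));

// A vector from a hypothetical 768-dimensional model is rejected.
try {
  assertSameDimension(storeMeta, new Array<number>(768).fill(0));
} catch (error) {
  console.log((error as Error).message);
}
```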
