mcp-local-rag by shinpr

Local RAG for private code and document search

Created 6 months ago
257 stars

Top 98.3% on SourcePulse

Summary

This project offers a local-first Retrieval Augmented Generation (RAG) server designed for developers. It enables semantic and keyword search across local code and technical documentation, providing a fully private, zero-setup solution that operates offline after an initial model download. As a result, AI assistants get accurate, context-aware access to sensitive local data without external dependencies or costs.

How It Works

The system chunks documents based on semantic similarity rather than fixed character counts, preserving the integrity of code blocks and natural topic boundaries. Text is converted into vector embeddings locally using Transformers.js. Search queries leverage a hybrid approach, combining semantic vector search with a keyword boost that prioritizes exact technical terms like function names or error codes. Results are then filtered by relevance gaps, yielding fewer but more reliable information chunks.
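The ranking and filtering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the ScoredChunk shape, the function names, the additive boost of 0.1 per matched term, and the gap threshold of 0.15 are all hypothetical, and semantic scores are assumed to be precomputed cosine similarities.

```typescript
// Hypothetical types and constants; none of these names come from mcp-local-rag.
interface ScoredChunk {
  text: string;
  semanticScore: number; // assumed to be a precomputed cosine similarity
}

// Add a fixed bonus for every query term that appears verbatim in the chunk.
// The match is case-sensitive, which suits identifiers such as useEffect.
function hybridScore(chunk: ScoredChunk, queryTerms: string[], boost = 0.1): number {
  const matches = queryTerms.filter((term) => chunk.text.includes(term)).length;
  return chunk.semanticScore + boost * matches;
}

// Rank by hybrid score, then cut the list at the first large drop
// between two neighbouring results (the "relevance gap").
function rankAndFilter(
  chunks: ScoredChunk[],
  queryTerms: string[],
  maxGap = 0.15
): ScoredChunk[] {
  const ranked = chunks
    .map((chunk) => ({ chunk, score: hybridScore(chunk, queryTerms) }))
    .sort((a, b) => b.score - a.score);

  const kept: ScoredChunk[] = [];
  for (let i = 0; i < ranked.length; i++) {
    if (i > 0 && ranked[i - 1].score - ranked[i].score > maxGap) break;
    kept.push(ranked[i].chunk);
  }
  return kept;
}
```

With this sketch, a chunk that contains the exact term useEffect can outrank a chunk with a slightly higher semantic score, and low-scoring stragglers after a sharp score drop are discarded rather than padded into a fixed top-K list.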

Quick Start & Requirements

  • Primary Install/Run: Execute npx -y mcp-local-rag to start the server or use CLI commands.
  • Prerequisites: Node.js environment (implied by npx). The initial embedding model download (~90MB) requires an internet connection and takes 1-2 minutes; subsequent operation is fully offline.
  • Setup: Zero-friction setup with no Docker, Python, or server management required.
  • Configuration: Integrates with AI coding tools like Cursor, Codex, and Claude Code via specific configuration files or commands. Manual model download is available at https://huggingface.co/Xenova/all-MiniLM-L6-v2.

Highlighted Details

  • Local-first RAG server with hybrid semantic and keyword search.
  • "Zero-friction setup" via npx, eliminating complex dependencies.
  • Fully private and offline operation post-model download.
  • Intelligent semantic chunking preserves document meaning and code integrity.
  • Keyword boost enhances ranking for exact technical terms (e.g., useEffect, class names).
  • Quality-first result filtering by relevance gaps.
  • Optional Agent Skills improve AI assistant query formulation and result interpretation.
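One common way to implement the semantic chunking mentioned in the list above is to start a new chunk wherever the similarity between neighbouring sentence embeddings drops. The sketch below illustrates that idea only; the EmbeddedSentence shape, the function names, and the 0.5 threshold are hypothetical, embeddings are assumed to be non-zero vectors of equal length computed elsewhere, and the project's real chunker may work differently.

```typescript
// Hypothetical shape: a sentence paired with an embedding computed elsewhere.
interface EmbeddedSentence {
  text: string;
  vector: number[];
}

// Cosine similarity of two equal-length, non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Group consecutive sentences; open a new chunk when the similarity
// to the previous sentence falls below the threshold.
function chunkBySimilarity(sentences: EmbeddedSentence[], threshold = 0.5): string[] {
  const chunks: string[][] = [];
  for (let i = 0; i < sentences.length; i++) {
    const startsNewChunk =
      i === 0 ||
      cosineSimilarity(sentences[i - 1].vector, sentences[i].vector) < threshold;
    if (startsNewChunk) chunks.push([]);
    chunks[chunks.length - 1].push(sentences[i].text);
  }
  return chunks.map((parts) => parts.join(" "));
}
```

Compared with fixed character counts, this kind of boundary detection keeps sentences about the same topic together, so chunk sizes vary with the content.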

Maintenance & Community

The provided README gives no specific details on maintainers, community channels (e.g., Discord, Slack), or active sponsorships. Development takes place in the GitHub repository: https://github.com/shinpr/mcp-local-rag.git.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive for personal and commercial use, allowing integration into closed-source applications.

Limitations & Caveats

Embedding generation currently relies on CPU via Transformers.js; GPU acceleration is experimental. The system is designed for single-user, local access and lacks multi-user support or built-in authentication. Supported document formats include PDF, DOCX, TXT, Markdown, and HTML; Excel, PowerPoint, and image formats are not yet supported. Switching embedding models necessitates deleting and re-ingesting the entire vector database due to incompatible vector dimensions.
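The re-ingestion requirement follows from how vector comparison works: similarity between two embeddings is only defined when both have the same number of dimensions. The sketch below illustrates this with a hypothetical in-memory index; the VectorIndex shape and assertCompatible function are assumptions for illustration and do not describe the project's storage layer. The 384 figure is the output size of all-MiniLM-L6-v2, and 768 stands in for a different, larger model.

```typescript
// Hypothetical in-memory index; the real project persists vectors in a database.
interface VectorIndex {
  dimension: number; // fixed when the first document is ingested
  vectors: number[][];
}

// Reject query vectors whose length differs from the stored vectors,
// since similarity between them would be undefined.
function assertCompatible(index: VectorIndex, queryVector: number[]): void {
  if (queryVector.length !== index.dimension) {
    throw new Error(
      `Dimension mismatch: index stores ${index.dimension}-d vectors, ` +
        `query has ${queryVector.length}; re-ingest with the new model`
    );
  }
}

// An index built with a 384-dimensional model.
const miniLmIndex: VectorIndex = { dimension: 384, vectors: [] };

// Same model: compatible, no error is thrown.
assertCompatible(miniLmIndex, new Array<number>(384).fill(0));

// Different model with 768-dimensional output: incompatible.
let mismatchMessage = "";
try {
  assertCompatible(miniLmIndex, new Array<number>(768).fill(0));
} catch (err) {
  mismatchMessage = (err as Error).message;
}
```

Even when two models happen to share a dimension count, their vector spaces are unrelated, so rebuilding the index after a model switch is the safe default.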

Health Check

  • Last commit: 9 hours ago
  • Responsiveness: Inactive
  • Pull requests (30d): 17
  • Issues (30d): 3
  • Star history: 35 stars in the last 30 days
