beginner-local-rag-system by jamwithai

Local RAG system for private document querying

Created 1 year ago

305 stars

Top 88.0% on SourcePulse

Summary

This repository provides the components for building a private, offline Retrieval-Augmented Generation (RAG) system. Users manage and query their personal documents locally, which offers a privacy-friendly alternative to cloud-based solutions. The system targets privacy-conscious individuals who want to use LLMs for document analysis without exposing their data. Its core benefit is that documents and queries never leave the user's machine.

How It Works

The system employs a hybrid approach combining traditional text matching and semantic search via OpenSearch. Document embeddings are generated using Sentence Transformers, facilitating efficient semantic retrieval. These retrieved contexts are then fed to local Large Language Models (LLMs) to generate personalized, context-aware responses. This architecture ensures data privacy by keeping all processing and documents on the user's machine.
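As a rough illustration of the hybrid retrieval described above, the sketch below builds an OpenSearch request body that combines a keyword `match` clause with a `knn` clause inside one `bool` query. It is not taken from the repository: the function name and the `text` and `embedding` field names are assumptions, and the project may combine scores differently (for example, with OpenSearch's dedicated hybrid query and a normalization pipeline).

```python
from __future__ import annotations

from typing import Any


def build_hybrid_query(
    query_text: str,
    query_vector: list[float],
    top_k: int = 5,
    text_field: str = "text",          # hypothetical field holding the chunk text
    vector_field: str = "embedding",   # hypothetical knn_vector field
) -> dict[str, Any]:
    """Build a request body mixing keyword (BM25) and k-NN vector clauses."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {
        "size": top_k,
        # Leave the stored vectors out of the response to keep it small.
        "_source": {"excludes": [vector_field]},
        "query": {
            "bool": {
                # A document matching either clause is returned; scores from
                # both clauses are summed, without normalization.
                "should": [
                    {"match": {text_field: {"query": query_text}}},
                    {"knn": {vector_field: {"vector": query_vector, "k": top_k}}},
                ]
            }
        },
    }
```

In practice the query vector would come from a Sentence Transformers model (for example `SentenceTransformer(...).encode(query_text).tolist()`), and the body would be passed to an `opensearch-py` client via `client.search(index=..., body=body)`. Because the two scores are on different scales, a plain `bool` combination tends to need tuning or normalization.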

Quick Start & Requirements

Installation involves cloning the repository, installing dependencies via pip install -r requirements.txt, configuring constants.py for embedding models and OpenSearch settings, and running the Streamlit application with streamlit run welcome.py.
Prerequisites include a Python environment, OpenSearch, Sentence Transformers models, and local LLMs. Specific hardware requirements (e.g., GPU, CUDA) or Python versions are not detailed.
The README links to a two-part blog guide (Part 1 and Part 2) for a detailed walkthrough.
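The summary does not show what constants.py contains, so the following is only a hypothetical sketch of the kind of settings the configuration step might involve. Every name and value here is an assumption chosen for illustration (the model is a commonly used Sentence Transformers checkpoint with 384-dimensional output, and 9200 is OpenSearch's default HTTP port); check the repository's own file for the real names.

```python
# Hypothetical constants.py sketch; the repository's actual names may differ.

# Sentence Transformers model used to embed documents and queries (assumed).
EMBEDDING_MODEL_PATH = "sentence-transformers/all-MiniLM-L6-v2"
# Output dimension of the model above; the index mapping must match it.
EMBEDDING_DIMENSION = 384

# Connection settings for a locally running OpenSearch node (assumed).
OPENSEARCH_HOST = "localhost"
OPENSEARCH_PORT = 9200
OPENSEARCH_INDEX = "documents"  # hypothetical index name


def validate_settings() -> None:
    """Fail early on obviously invalid values before the app starts."""
    if EMBEDDING_DIMENSION <= 0:
        raise ValueError("EMBEDDING_DIMENSION must be positive")
    if not 0 < OPENSEARCH_PORT < 65536:
        raise ValueError("OPENSEARCH_PORT must be a valid TCP port")
    if not OPENSEARCH_INDEX or OPENSEARCH_INDEX != OPENSEARCH_INDEX.lower():
        # OpenSearch index names must be lowercase.
        raise ValueError("OPENSEARCH_INDEX must be a non-empty lowercase name")
```

Keeping the embedding dimension next to the model name is a deliberate choice in this sketch: if the model is swapped, the vector field in the index has to be recreated with the new dimension.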

Highlighted Details

Enables a fully private, offline RAG system for personal documents.
Features hybrid search capabilities leveraging OpenSearch for both keyword and semantic matching.
Designed for easy integration with local LLMs for customized responses.
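To make the local LLM integration more concrete, here is a minimal sketch of how retrieved passages might be assembled into a prompt. The function name, the prompt wording, and the character budget are all assumptions for illustration; the repository's actual prompt template is not shown in this summary.

```python
from __future__ import annotations


def build_prompt(question: str, contexts: list[str], max_chars: int = 4000) -> str:
    """Join retrieved passages and the user question into one prompt string."""
    parts: list[str] = []
    used = 0
    for i, passage in enumerate(contexts, start=1):
        block = f"[{i}] {passage.strip()}"
        if used + len(block) > max_chars:
            break  # stop before exceeding the context budget
        parts.append(block)
        used += len(block)
    context_text = "\n\n".join(parts) if parts else "(no context retrieved)"
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )
```

The resulting string can be sent to whichever local model runner is in use. Numbering the passages makes it possible to ask the model to cite which passage supports its answer.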

Maintenance & Community

No specific details on contributors, sponsorships, or community channels (like Discord/Slack) are provided.

Licensing & Compatibility

The license type is not specified.

Limitations & Caveats

The README does not detail specific hardware requirements, making performance estimation difficult.
The setup relies on manual configuration of constants and external services like OpenSearch and LLMs, potentially requiring significant technical expertise.
The project's description field is explicitly marked as "None".

Health Check

Last Commit

8 months ago

Responsiveness

Inactive

Pull Requests (30d)

0

Issues (30d)

0

Star History

35 stars in the last 30 days

Explore Similar Projects

colette by jolibrain

Multimodal RAG for local document interaction

Created 10 months ago

Updated 1 month ago

mcp-documentation-server by andrea9293

Bridging knowledge gaps with AI-powered document search

Created 8 months ago

Updated 2 months ago

second-brain by henrydaum

Desktop RAG app with multimodal AI and hybrid search

Created 5 months ago

Updated 1 month ago

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind).

local_llama by jlonge4

Local LLM chatbot for documents, runnable offline

Created 2 years ago

Updated 1 year ago

local-LLM-with-RAG by amscotti

Local LLM inference with RAG for document Q&A

Created 2 years ago

Updated 1 month ago

SupabaseAuthWithSSR by ElectricCodeGuy

Next.js template with Supabase auth, RAG, and AI web search

Created 2 years ago

Updated 1 month ago

epstein-docs.github.io by epstein-docs

AI-powered archive for searchable public documents

Created 4 months ago

Updated 4 months ago

eclaire by eclaire-labs

AI assistant for your private data

Created 5 months ago

Updated 1 day ago

-kykms by mahonelau

Document KMS for team knowledge sharing, powered by Elasticsearch

Created 3 years ago

Updated 1 week ago

Starred by Tobi Lutke (Cofounder of Shopify), Hursh Agrawal (Cofounder of The Browser Company), and 12 more.

qmd by tobi

AI-powered local search for your documents

Created 2 months ago

Updated 3 days ago

blinko by blinkospace

Self-hosted personal AI note tool for thought capture and organization

Created 1 year ago

Updated 1 week ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

WeKnora by Tencent

LLM framework for deep document understanding and RAG

Created 7 months ago

Updated 19 hours ago
