amscotti: Local LLM inference with RAG for document Q&A
Top 99.8% on SourcePulse
Summary
This project offers an experimental sandbox for deploying local Large Language Models (LLMs) via Ollama, implementing Retrieval-Augmented Generation (RAG) for question answering against user documents. It targets developers exploring private, on-premises LLM applications, featuring a Streamlit UI for intuitive interaction. The core benefit is enabling RAG workflows locally, enhancing data privacy and control.
How It Works
The system orchestrates a local RAG pipeline using Ollama for LLM inference and embedding generation (nomic-embed-text). Langchain manages the data flow: documents (PDFs, Markdown) are ingested, embedded, and stored in ChromaDB. User queries trigger retrieval of semantically similar document chunks from Chroma. These chunks, combined with the query, are fed to the local LLM for contextually relevant answers. A Streamlit application provides a graphical interface for model selection and document directory management.
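The flow described above can be summarized in a minimal sketch (not the project's actual code), assuming the langchain-community Ollama and Chroma integrations; the directory name, chunk sizes, retrieval depth, and prompt are illustrative assumptions.

```python
# Minimal sketch of the described pipeline: ingest PDFs, embed with
# nomic-embed-text via Ollama, index in ChromaDB, retrieve similar chunks,
# and answer with a local LLM. Not the project's actual code.
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.chat_models import ChatOllama
from langchain.text_splitter import RecursiveCharacterTextSplitter

# 1. Ingest and chunk documents (directory name taken from the project's default).
docs = PyPDFDirectoryLoader("Research").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks with Ollama's nomic-embed-text and store them in Chroma.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
db = Chroma.from_documents(chunks, embedding=embeddings)

# 3. Retrieve semantically similar chunks and feed them, with the query, to the LLM.
query = "What are the key findings in these documents?"
context = "\n\n".join(d.page_content for d in db.similarity_search(query, k=4))
llm = ChatOllama(model="mistral")
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
print(answer.content)
```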
Quick Start & Requirements
- Install dependencies: uv sync
- Run the app: uv run app.py -m <model_name> -p <path_to_documents> (defaults: mistral model, Research directory; see the sketch below)
- Optionally choose an embedding model: -e <embedding_model_name> (default: nomic-embed-text)
- Launch the Streamlit UI: uv run streamlit run ui.py
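To make the flag defaults above concrete, here is a hypothetical argparse sketch; the project's actual app.py may name and wire its arguments differently.

```python
# Hypothetical illustration of the CLI flags and defaults listed above.
import argparse

parser = argparse.ArgumentParser(description="Local RAG Q&A over documents")
parser.add_argument("-m", "--model", default="mistral",
                    help="Ollama model used for answering (default: mistral)")
parser.add_argument("-p", "--path", default="Research",
                    help="Directory of documents to index (default: Research)")
parser.add_argument("-e", "--embedding-model", default="nomic-embed-text",
                    help="Ollama embedding model (default: nomic-embed-text)")
args = parser.parse_args()

print(f"Answering with {args.model}, embedding with {args.embedding_model}, "
      f"indexing documents under {args.path}")
```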
Maintenance & Community
The README provides no specific details on maintainers, community support channels (e.g., Discord, Slack), or a public roadmap.
Licensing & Compatibility
The project's license type and compatibility restrictions for commercial use or integration with closed-source projects are not specified in the README.
Limitations & Caveats
This repository is explicitly labeled an "experimental sandbox." Embeddings are reloaded on every application run, an inefficiency the README acknowledges and accepts only for testing purposes.
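For context on that caveat, a minimal sketch (an assumption, not the project's code) of how a persistent Chroma index could be reused across runs instead of re-embedding every time; the persist_directory, sample document, and embedding model choice are illustrative.

```python
# Sketch only: persist the Chroma index so embeddings are not rebuilt each run.
# persist_directory and the sample document are assumptions for illustration.
import os

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

embeddings = OllamaEmbeddings(model="nomic-embed-text")
persist_dir = "chroma_index"  # hypothetical on-disk location

if os.path.isdir(persist_dir):
    # Later runs: open the saved index instead of re-embedding documents.
    db = Chroma(persist_directory=persist_dir, embedding_function=embeddings)
else:
    # First run: embed once and write the index to disk.
    docs = [Document(page_content="Example chunk of a research note.")]
    db = Chroma.from_documents(docs, embedding=embeddings, persist_directory=persist_dir)

print(db.similarity_search("research", k=1)[0].page_content)
```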
Last updated: 8 months ago (inactive)