local-LLM-with-RAG by amscotti

Local LLM inference with RAG for document Q&A

Created 2 years ago · 251 stars · Top 99.8% on SourcePulse

Project Summary

This project offers an experimental sandbox for running local Large Language Models (LLMs) via Ollama, implementing Retrieval-Augmented Generation (RAG) for question answering against a user's own documents. It targets developers exploring private, on-premises LLM applications and includes a Streamlit UI for interactive use. The core benefit is running a full RAG workflow locally, keeping documents and queries under the user's control.

How It Works

The system orchestrates a local RAG pipeline using Ollama for both LLM inference and embedding generation (nomic-embed-text). Langchain manages the data flow: documents (PDFs, Markdown) are ingested, embedded, and stored in ChromaDB. A user query triggers retrieval of the semantically most similar document chunks from ChromaDB; these chunks, combined with the query, are fed to the local LLM to produce a contextually grounded answer. A Streamlit application provides a graphical interface for model selection and document directory management.
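
In outline, and assuming the same Langchain, ChromaDB, and Ollama stack the summary describes, the pipeline looks roughly like the minimal Python sketch below. The file path, chunk sizes, and prompt wording are illustrative assumptions, not the project's actual code.

    # Minimal RAG sketch: ingest -> embed -> retrieve -> generate.
    # Assumes langchain-ollama, langchain-chroma, and pypdf are installed,
    # and an Ollama server is running with both models already pulled.
    from langchain_ollama import OllamaLLM, OllamaEmbeddings
    from langchain_chroma import Chroma
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # Ingest: load a PDF and split it into overlapping chunks.
    docs = PyPDFLoader("Research/paper.pdf").load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100).split_documents(docs)

    # Embed and store: nomic-embed-text vectors go into a Chroma collection.
    db = Chroma.from_documents(chunks, OllamaEmbeddings(model="nomic-embed-text"))

    # Retrieve: fetch the chunks most semantically similar to the question.
    question = "What are the key findings?"
    context = "\n\n".join(d.page_content for d in db.similarity_search(question, k=4))

    # Generate: the local LLM answers using the retrieved context.
    llm = OllamaLLM(model="mistral")
    print(llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}"))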

Quick Start & Requirements

  • Requirements: Ollama version 0.5.7 or higher.
  • Setup: Clone the repository and install uv (see the Astral documentation). Run uv sync to create the Python virtual environment and install dependencies.
  • Running Project: Execute uv run app.py -m <model_name> -p <path_to_documents> (defaults: mistral model, Research directory). Optionally specify the embedding model with -e <embedding_model_name> (default: nomic-embed-text). A sketch of this flag handling follows the list.
  • Streamlit UI: Launch via uv run streamlit run ui.py.
  • Initial Execution: The first run downloads the required LLM and embedding models from Ollama and may take a while.
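
As referenced in the list, the flag handling could be expressed as a minimal argparse sketch. The flag names and defaults come from the bullets above; the help text and everything else are assumptions, not the project's actual app.py.

    # Hypothetical sketch of app.py's CLI, matching the documented flags and defaults.
    import argparse

    parser = argparse.ArgumentParser(description="Local RAG Q&A over a document directory")
    parser.add_argument("-m", "--model", default="mistral",
                        help="Ollama model used to answer questions")
    parser.add_argument("-e", "--embedding_model", default="nomic-embed-text",
                        help="Ollama model used to embed document chunks")
    parser.add_argument("-p", "--path", default="Research",
                        help="directory of PDF/Markdown files to index")
    args = parser.parse_args()
    print(f"model={args.model} embeddings={args.embedding_model} docs={args.path}")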

Highlighted Details

  • Implements a complete local RAG pipeline using Ollama for LLM and embeddings.
  • Utilizes ChromaDB for efficient vector storage and similarity search.
  • Features a Streamlit web UI for selecting a model and asking questions interactively (a minimal sketch follows this list).
  • Command-line interface allows flexible configuration of models and data sources.
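
To give a rough idea of the Streamlit layer, here is a hedged minimal sketch; the widget layout and hard-coded model list are assumptions rather than the project's actual ui.py, and retrieval is omitted for brevity.

    # Hypothetical minimal UI in the spirit of ui.py; run with: streamlit run ui.py
    import streamlit as st
    from langchain_ollama import OllamaLLM

    st.title("Local LLM with RAG")
    model = st.selectbox("Ollama model", ["mistral", "llama3", "gemma"])
    question = st.text_input("Ask a question about your documents")

    if question:
        # The real app would first augment the question with chunks retrieved
        # from ChromaDB; this sketch calls the model directly.
        st.write(OllamaLLM(model=model).invoke(question))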

Maintenance & Community

The README provides no specific details on maintainers, community support channels (e.g., Discord, Slack), or a public roadmap.

Licensing & Compatibility

The project's license type and compatibility restrictions for commercial use or integration with closed-source projects are not specified in the README.

Limitations & Caveats

This repository is explicitly labeled an experimental sandbox. Embeddings are rebuilt on every application run, an inefficiency the author notes was accepted solely for testing.
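
For readers adapting the code, one common mitigation, assuming the langchain-chroma stack described above, is to persist the vector store to disk so embeddings are computed only once. The directory name and sample text below are illustrative.

    # Hypothetical sketch: persist Chroma vectors so later runs skip re-embedding.
    import os
    from langchain_chroma import Chroma
    from langchain_ollama import OllamaEmbeddings

    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    if os.path.isdir("chroma_db"):
        # A previous run already embedded the documents; reopen the stored vectors.
        db = Chroma(persist_directory="chroma_db", embedding_function=embeddings)
    else:
        # First run: embed once and write the vectors to disk.
        db = Chroma.from_texts(["example document chunk"], embeddings,
                               persist_directory="chroma_db")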

Health Check

  • Last Commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 30 days
