Local RAG pipeline demo for chatting with PDFs
This project provides a local Retrieval Augmented Generation (RAG) pipeline for chatting with PDF documents, targeting developers and researchers who need to process and query private data without external services. It offers both a Jupyter notebook for experimentation and a Streamlit web interface for user-friendly interaction, enabling efficient local document analysis.
How It Works
The system uses LangChain to orchestrate the RAG pipeline, integrating with Ollama for local LLM inference and embedding generation. PDFs are split into chunks, and vector embeddings of the chunks are stored locally in ChromaDB. A multi-query retriever generates several rephrasings of the user's question to broaden retrieval, and the most relevant chunks are then passed to the LLM as context for response generation.
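The flow can be sketched in a few lines of LangChain code. This is a minimal illustration of the pipeline described above, not the project's actual implementation; the import paths, the embedding model (nomic-embed-text), the chunking parameters, and the file name example.pdf are assumptions that may differ from the repository's code and from your installed LangChain version.

```python
# Minimal sketch of the local RAG flow (illustrative only; package layout
# and model names are assumptions, not the project's exact code).
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain.retrievers.multi_query import MultiQueryRetriever

# 1. Load the PDF and split it into overlapping chunks.
docs = UnstructuredPDFLoader("example.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

# 2. Embed the chunks locally via Ollama and store them in ChromaDB.
vectordb = Chroma.from_documents(
    chunks, embedding=OllamaEmbeddings(model="nomic-embed-text")
)

# 3. Wrap the vector store in a multi-query retriever driven by the local LLM,
#    which rephrases the question several ways to broaden retrieval.
llm = ChatOllama(model="llama3.2")
retriever = MultiQueryRetriever.from_llm(retriever=vectordb.as_retriever(), llm=llm)

# 4. Retrieve relevant chunks and feed them to the LLM as context.
question = "What is this document about?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```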
Quick Start & Requirements
1. Pull a local model with Ollama: ollama pull llama3.2 (or a preferred model).
2. Clone the repository: git clone https://github.com/tonykipkemboi/ollama_pdf_rag.git
3. Create and activate a virtual environment: python -m venv venv, then source venv/bin/activate (or .\venv\Scripts\activate on Windows).
4. Install dependencies: pip install -r requirements.txt
5. Run the app with python run.py (access at http://localhost:8501), or use updated_rag_notebook.ipynb for the notebook workflow.
Highlighted Details
Maintenance & Community
The project is maintained by Tony Kipkemboi. Links to X, LinkedIn, YouTube, and GitHub are provided for community engagement and support.
Licensing & Compatibility
Licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
Performance on CPU-only systems will be slower. Users may need to adjust chunk size for memory management, particularly on systems with limited resources. Troubleshooting for ONNX DLL errors is documented.
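If memory is tight, one way to reduce the footprint is to pass a smaller chunk size and overlap to the text splitter. The values below are illustrative, not the project's defaults:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Smaller chunks lower peak memory during embedding; exact values are
# illustrative and should be tuned to the documents and hardware at hand.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)  # `docs` = previously loaded PDF documents
```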