Local RAG pipeline demo for chatting with PDFs
This project provides a local Retrieval Augmented Generation (RAG) pipeline for chatting with PDF documents, targeting developers and researchers who need to process and query private data without external services. It offers both a Jupyter notebook for experimentation and a Streamlit web interface for user-friendly interaction, enabling efficient local document analysis.
How It Works
The system uses LangChain to orchestrate the RAG pipeline, integrating with Ollama for local LLM inference and embedding generation. PDFs are split into chunks, and vector embeddings of the chunks are stored locally in ChromaDB. A multi-query retriever generates several rephrasings of the user's question to broaden retrieval, and the most relevant chunks are then passed to the LLM as context for response generation.
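The flow can be sketched in a few lines of LangChain code. This is a minimal illustration of the pipeline described above, not the project's actual implementation; the import paths, the embedding model (nomic-embed-text), the chunking parameters, and the file name example.pdf are assumptions that may differ from the repository's code and from your installed LangChain version.

```python
# Minimal sketch of the local RAG flow (illustrative only; package layout
# and model names are assumptions, not the project's exact code).
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain.retrievers.multi_query import MultiQueryRetriever

# 1. Load the PDF and split it into overlapping chunks.
docs = UnstructuredPDFLoader("example.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

# 2. Embed the chunks locally via Ollama and store them in ChromaDB.
vectordb = Chroma.from_documents(
    chunks, embedding=OllamaEmbeddings(model="nomic-embed-text")
)

# 3. Wrap the vector store in a multi-query retriever driven by the local LLM,
#    which rephrases the question several ways to broaden retrieval.
llm = ChatOllama(model="llama3.2")
retriever = MultiQueryRetriever.from_llm(retriever=vectordb.as_retriever(), llm=llm)

# 4. Retrieve relevant chunks and feed them to the LLM as context.
question = "What is this document about?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```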
Quick Start & Requirements
1. Pull a local model with Ollama: ollama pull llama3.2 (or a preferred model).
2. Clone the repository: git clone https://github.com/tonykipkemboi/ollama_pdf_rag.git
3. Create and activate a virtual environment: python -m venv venv, then source venv/bin/activate (or .\venv\Scripts\activate on Windows).
4. Install dependencies: pip install -r requirements.txt
5. Run the app with python run.py (access at http://localhost:8501), or use updated_rag_notebook.ipynb for the notebook workflow.
Highlighted Details
Maintenance & Community
The project is maintained by Tony Kipkemboi. Links to X, LinkedIn, YouTube, and GitHub are provided for community engagement and support.
Licensing & Compatibility
Licensed under the MIT License, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
Performance on CPU-only systems will be slower. Users may need to adjust chunk size for memory management, particularly on systems with limited resources. Troubleshooting for ONNX DLL errors is documented.
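If memory is tight, one way to reduce the footprint is to pass a smaller chunk size and overlap to the text splitter. The values below are illustrative, not the project's defaults:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Smaller chunks lower peak memory during embedding; exact values are
# illustrative and should be tuned to the documents and hardware at hand.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)  # `docs` = previously loaded PDF documents
```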