umbertogriffo/rag-chatbot: RAG chatbot that answers questions using context from Markdown files
Top 81.5% on SourcePulse
This project provides a conversational RAG chatbot that answers questions based on a collection of Markdown files. It's designed for users who want to leverage local, open-source LLMs for document-based Q&A, offering features like conversation memory and multiple response synthesis strategies.
How It Works
The chatbot processes Markdown files by splitting them into chunks, generating embeddings using all-MiniLM-L6-v2, and storing them in a Chroma vector database. When a user asks a question, an LLM first rewrites the query for better retrieval. Relevant document chunks are then fetched from Chroma and used as context to generate an answer with a local LLM via llama-cpp-python. It supports conversation memory and offers three response synthesis strategies: Create and Refine, Hierarchical Summarization, and Async Hierarchical Summarization.
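Below is a minimal sketch of this pipeline, assuming sentence-transformers, chromadb, and llama-cpp-python are installed and a quantized GGUF model is available locally. The model path, prompts, sample chunks, and the `chat`/`ask` helpers are illustrative, not the project's actual code:

```python
import chromadb
from llama_cpp import Llama
from sentence_transformers import SentenceTransformer

MODEL_PATH = "models/llama-q4_k_m.gguf"  # hypothetical 4-bit quantized model file

embedder = SentenceTransformer("all-MiniLM-L6-v2")
llm = Llama(model_path=MODEL_PATH, n_ctx=4096, verbose=False)

# 1. Index: split Markdown files into chunks, embed them, store in Chroma.
chunks = ["# Setup\nRun `make setup_cuda` on NVIDIA machines ...",
          "# Usage\nStart the app with `streamlit run ...`"]
client = chromadb.Client()
collection = client.create_collection("markdown_docs")
collection.add(
    ids=[str(i) for i in range(len(chunks))],
    documents=chunks,
    embeddings=embedder.encode(chunks).tolist(),
)

def chat(prompt: str, max_tokens: int = 256) -> str:
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}], max_tokens=max_tokens)
    return out["choices"][0]["message"]["content"].strip()

def ask(question: str, k: int = 2) -> str:
    # 2. Rewrite the user question into a standalone search query.
    query = chat(f"Rewrite as a standalone search query: {question}", max_tokens=64)
    # 3. Retrieve the k most similar chunks from Chroma.
    hits = collection.query(query_embeddings=embedder.encode([query]).tolist(),
                            n_results=k)
    context = "\n\n".join(hits["documents"][0])
    # 4. Answer with the retrieved chunks as context (single-shot synthesis;
    #    the project's Create and Refine strategy iterates over chunks instead).
    return chat(f"Use only this context to answer.\n\nContext:\n{context}\n\n"
                f"Question: {question}")

print(ask("How do I set up the project on an NVIDIA GPU?"))
```

The single LLM call in step 4 is the simplest case; the project's synthesis strategies replace it with iterative refinement over chunks or (async) hierarchical summarization of partial answers.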
Quick Start & Requirements
- Install dependencies: make setup_cuda (for NVIDIA GPUs) or make setup_metal (for macOS with Metal).
- Run the basic chatbot: streamlit run chatbot/chatbot_app.py -- --model <model_name>
- Run the RAG chatbot: streamlit run chatbot/rag_chatbot_app.py -- --model <model_name> --k <num_chunks> --synthesis-strategy <strategy>
Highlighted Details

- Uses llama-cpp-python for efficient local LLM execution with quantized models (4-bit precision).
- Ports RecursiveCharacterTextSplitter from LangChain rather than adding LangChain as a dependency; a simplified sketch of what that splitter does follows this list.
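As a rough illustration of the ported splitter's behavior, here is a minimal recursive character splitter. It is a simplification (LangChain's RecursiveCharacterTextSplitter also supports chunk overlap and configurable length functions), and the function name is ours, not the project's:

```python
def recursive_split(text: str, chunk_size: int = 512,
                    separators=("\n\n", "\n", " ", "")) -> list[str]:
    """Split on the coarsest separator present, recursing into oversized
    pieces, so chunks follow Markdown structure where possible."""
    if len(text) <= chunk_size:
        return [text]
    # Pick the coarsest separator that actually occurs in the text.
    sep = next((s for s in separators if s and s in text), "")
    if not sep:
        # No separator applies: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks: list[str] = []
    current = ""
    for part in text.split(sep):
        if len(part) > chunk_size:
            # Oversized piece: flush the buffer and recurse into it.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(part, chunk_size, separators))
        elif not current:
            current = part
        elif len(current) + len(sep) + len(part) <= chunk_size:
            current += sep + part  # greedily merge small pieces
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks

print(recursive_split("# Title\n\nFirst paragraph...\n\nSecond paragraph...", 40))
```

Splitting on progressively finer separators keeps paragraphs and lines intact wherever possible, which preserves Markdown structure better than fixed-width slicing.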
Maintenance & Community

Licensing & Compatibility
Limitations & Caveats