rag-chatbot by umbertogriffo

RAG chatbot answers questions using context from Markdown files

Created 2 years ago
402 stars

Top 72.0% on SourcePulse

Project Summary

This project provides a conversational RAG chatbot that answers questions based on a collection of Markdown files. It's designed for users who want to leverage local, open-source LLMs for document-based Q&A, offering features like conversation memory and multiple response synthesis strategies.

How It Works

The chatbot processes Markdown files by splitting them into chunks, generating embeddings using all-MiniLM-L6-v2, and storing them in a Chroma vector database. When a user asks a question, an LLM first rewrites the query for better retrieval. Relevant document chunks are then fetched from Chroma and used as context to generate an answer with a local LLM via llama-cpp-python. It supports conversation memory and offers three response synthesis strategies: Create and Refine, Hierarchical Summarization, and Async Hierarchical Summarization.
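The retrieval flow described above can be sketched end-to-end with toy stand-ins: a naive word splitter in place of the refactored RecursiveCharacterTextSplitter, a bag-of-words "embedding" in place of all-MiniLM-L6-v2, and brute-force similarity search in place of Chroma. Only the shape of the pipeline is taken from the project; every function here is illustrative:

```python
# Toy sketch of the chunk -> embed -> retrieve pipeline. The real project
# uses all-MiniLM-L6-v2 embeddings, a Chroma vector store, and a local LLM
# via llama-cpp-python; these stand-ins only mirror the data flow.

from collections import Counter
import math


def split_into_chunks(text: str, chunk_size: int = 40) -> list[str]:
    """Naive word-based splitter (stand-in for the recursive text splitter)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Bag-of-words term counts standing in for a dense sentence embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the (possibly rewritten) question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


docs = "Chroma stores embeddings. The chatbot rewrites queries before retrieval."
chunks = split_into_chunks(docs, chunk_size=4)
context = retrieve("How are queries rewritten?", chunks, k=1)
```

In the actual app, the retrieved `context` would then be passed to the local LLM along with the conversation history to synthesize the final answer.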

Quick Start & Requirements

  • Install: Use make setup_cuda (for NVIDIA) or make setup_metal (for macOS Metal).
  • Prerequisites: Python 3.10+, Poetry 1.7.0, GPU with CUDA 12.1+ (for setup_cuda).
  • Run Chatbot: streamlit run chatbot/chatbot_app.py -- --model <model_name>
  • Run RAG Chatbot: streamlit run chatbot/rag_chatbot_app.py -- --model <model_name> --k <num_chunks> --synthesis-strategy <strategy>
  • Docs: Llama Cpp Python GitHub Issues

Highlighted Details

  • Leverages llama-cpp-python for efficient local LLM execution with quantization (4-bit precision).
  • Supports various open-source LLMs including Llama 3.1, OpenChat, Starling, Phi-3.5, and StableLM.
  • Implements conversation-aware memory and three context synthesis strategies for handling long contexts.
  • Includes a refactored version of LangChain's RecursiveCharacterTextSplitter, avoiding LangChain itself as a dependency.
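As a rough illustration of one of the strategies above, "Create and Refine" drafts an answer from the first retrieved chunk and then refines it against each subsequent chunk. The sketch below uses `toy_llm`, a hypothetical stand-in for the local llama-cpp-python model, and invented prompt wording; it shows only the loop structure, not the project's actual prompts:

```python
# Hedged sketch of a "Create and Refine" synthesis loop. `toy_llm` is a
# placeholder: it simply echoes the last line of its prompt, so the
# refinement chain stays deterministic and inspectable.

def toy_llm(prompt: str) -> str:
    """Placeholder model: returns the final line of the prompt as the 'answer'."""
    return prompt.strip().splitlines()[-1]


def create_and_refine(question: str, chunks: list[str]) -> str:
    # Draft an initial answer from the first chunk...
    answer = toy_llm(
        f"Context: {chunks[0]}\nQuestion: {question}\n"
        f"Answer: draft from {chunks[0]}"
    )
    # ...then refine it once per remaining chunk.
    for chunk in chunks[1:]:
        answer = toy_llm(
            f"Existing answer: {answer}\nNew context: {chunk}\n"
            f"Refined answer: {answer} + {chunk}"
        )
    return answer


answer = create_and_refine("What does the chatbot do?", ["chunk A", "chunk B"])
```

The hierarchical (tree) summarization strategies differ in that they summarize chunks pairwise and merge the summaries upward instead of refining a single running answer.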

Maintenance & Community

  • No specific contributors, sponsorships, or community links (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

  • The project does not explicitly state a license in the README.

Limitations & Caveats

  • The README warns that LLMs may generate hallucinations or false information.
  • GPU acceleration on M1 Macs requires using an ARM version of Python; x86 Python will not use the GPU.
Health Check

  • Last Commit: 19 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 16 stars in the last 30 days

Explore Similar Projects

Starred by Lysandre Debut (Chief Open-Source Officer at Hugging Face), Tim J. Baek (Founder of Open WebUI), and 12 more.

chat-ui by huggingface

  • Chat UI: open-source interface for LLMs
  • Top 0.2% on SourcePulse · 11k stars
  • Created 3 years ago · Updated 19 hours ago
  • Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Yaowei Zheng (Author of LLaMA-Factory).

AstrBot by AstrBotDevs

  • LLM chatbot/framework for multiple platforms
  • Top 2.2% on SourcePulse · 30k stars
  • Created 3 years ago · Updated 4 hours ago