RAG for personal knowledge base Q&A
Top 85.2% on sourcepulse
This project provides a personal knowledge base assistant that leverages large language models (LLMs) to answer questions based on provided documentation. It's designed for users who need to efficiently query and retrieve information from extensive, complex datasets, offering a streamlined approach to knowledge management and access.
How It Works
The core of the project is a Retrieval-Augmented Generation (RAG) pipeline built with Langchain. It ingests various document formats (PDF, Markdown, TXT), splits them into manageable chunks, and generates vector embeddings using models like m3e or OpenAI. These embeddings are stored in a Chroma vector database for efficient similarity search. When a user asks a question, the system vectorizes the query, retrieves the most relevant document chunks from the database, and feeds them as context to an LLM (supporting OpenAI, Ernie Bot, Spark, and ChatGLM) to generate a concise answer.
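Below is a minimal sketch of such a pipeline using classic Langchain APIs. The file path, chunk sizes, and the use of OpenAI embeddings and chat model are illustrative assumptions, not the project's exact configuration (the project also supports m3e embeddings and several non-OpenAI LLMs).

```python
# Hedged sketch of a RAG pipeline: load a document, chunk it, embed the chunks
# into Chroma, then answer a question with retrieved context. Paths and model
# choices here are placeholders, not the project's actual settings.
from langchain.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load the source document and split it into manageable chunks.
docs = PyMuPDFLoader("knowledge_db/example.pdf").load()   # hypothetical file
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 2. Embed the chunks and persist them in a Chroma vector store.
vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),        # the project can also use m3e embeddings
    persist_directory="vector_db",
)

# 3. Retrieve the most relevant chunks for a query and let the LLM answer.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0),
    retriever=vectordb.as_retriever(search_kwargs={"k": 3}),
)
print(qa.run("What does the documentation say about installation?"))
```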
Quick Start & Requirements
- Set up the environment: create a conda environment (`conda create -n llm-universe python==3.9.0`), activate it (`conda activate llm-universe`), and install dependencies (`pip install -r requirements.txt`).
- Run the API service: `cd project/serve`, then `uvicorn api:app --reload` (Linux) or `python api.py` (Windows).
- Launch the Gradio demo: `cd llm-universe/project/serve`, then `python run_gradio.py -model_name='chatglm_std' -embedding_model='m3e' -db_path='../../data_base/knowledge_db' -persist_path='../../data_base/vector_db'` (see the sketch after this list for what the persisted store holds).
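As a rough illustration of what the `-persist_path` directory contains, the persisted Chroma store can be reopened and queried directly. This is a hedged sketch, not the project's serving code; it assumes OpenAI embeddings for brevity, whereas the commands above use m3e.

```python
# Hedged sketch: reopen the persisted vector store and run a similarity search,
# mirroring the retrieval step the API and Gradio services perform.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

vectordb = Chroma(
    persist_directory="../../data_base/vector_db",   # same path as the command above
    embedding_function=OpenAIEmbeddings(),           # assumption; project also supports m3e
)
for doc in vectordb.similarity_search("How do I start the API service?", k=3):
    print(doc.page_content[:200])
```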
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats