RAG agent for complex question answering
Top 30.0% on sourcepulse
This repository offers a Retrieval-Augmented Generation (RAG) agent for complex question answering over custom data. It targets users who have hit the limits of simple semantic search: the agent is controllable and autonomous, breaks multi-step queries into sub-tasks and reasons through them, and aims to prevent hallucinations by keeping answers grounded in the retrieved data.
How It Works
The agent uses a deterministic graph as its core reasoning engine. At ingestion time it splits documents into chapters, cleans the text, and generates detailed summaries with LLMs; it also builds a database of book quotes. Both the content and the summaries are encoded into vector stores. At query time the question is anonymized, a high-level plan is generated from the anonymized question, and the plan is then de-anonymized into executable tasks. Each task either retrieves distilled information from the vector stores or generates an answer via chain-of-thought reasoning, and the agent continuously verifies results and re-plans as new context arrives.
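The query-time flow described above (anonymize, plan, de-anonymize, execute, verify) can be illustrated with a small sketch. This is not the repository's code: every function name here (anonymize, deanonymize, make_plan, retrieve, run_agent) is hypothetical, the planner is a hard-coded stub where the real agent would call an LLM, and keyword overlap stands in for vector-store retrieval. The two "documents" are made-up toy sentences.

```python
import re


def anonymize(question, entities):
    """Swap known entity names for neutral placeholders before planning."""
    mapping = {}
    for i, name in enumerate(entities):
        if name in question:
            placeholder = f"<ENTITY_{i}>"
            question = question.replace(name, placeholder)
            mapping[placeholder] = name
    return question, mapping


def deanonymize(text, mapping):
    """Restore the original entity names in a plan step."""
    for placeholder, name in mapping.items():
        text = text.replace(placeholder, name)
    return text


def make_plan(anonymized_question):
    # Stub planner (assumption): a real agent would prompt an LLM here.
    return [
        f"retrieve: {anonymized_question}",
        f"answer: {anonymized_question}",
    ]


def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, store, k=1):
    # Keyword overlap as a stand-in for vector similarity search.
    ranked = sorted(store, key=lambda doc: len(tokens(query) & tokens(doc)), reverse=True)
    return [doc for doc in ranked[:k] if tokens(query) & tokens(doc)]


def run_agent(question, entities, store):
    anon_question, mapping = anonymize(question, entities)
    tasks = [deanonymize(step, mapping) for step in make_plan(anon_question)]
    context = []
    for task in tasks:
        kind, _, payload = task.partition(": ")
        if kind == "retrieve":
            context.extend(retrieve(payload, store))
        elif kind == "answer":
            # Verification stub: refuse to answer without supporting context.
            if not context:
                return "No grounded answer found."
            return f"Based on: {context[0]}"
    return "No grounded answer found."


if __name__ == "__main__":
    toy_store = [
        "Harry defeated the troll with Ron.",
        "Hermione brewed a potion in the bathroom.",
    ]
    print(run_agent("Who helped Harry defeat the troll?", ["Harry"], toy_store))
```

The point of the sketch is the ordering: the planner only ever sees placeholders, and names are restored before any task touches the data. A real implementation would also re-plan when verification fails, which this stub replaces with an early refusal.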
Quick Start & Requirements
Create a .env file with OPENAI_API_KEY and GROQ_API_KEY, and install dependencies with pip install -r requirements.txt. Alternatively, use docker-compose up --build.
See sophisticated_rag_agent_harry_potter.ipynb for a tutorial. Run real-time visualization with streamlit run simulate_agent.py or via Docker at http://localhost:8501/.
Highlighted Details
Maintenance & Community
The project is maintained by Nir Diamant. Updates and insights are shared via a Substack newsletter. Contributions are welcomed via pull requests or issues.
Licensing & Compatibility
Licensed under the Apache-2.0 License. This permissive license allows for commercial use and integration with closed-source projects.
Limitations & Caveats
The project relies on external LLM APIs, which incur costs and may be subject to rate limits. Although the agent is designed to prevent hallucinations, its effectiveness depends on the quality of the LLM and of the grounding data. The "sophisticated deterministic graph" is a conceptual description; its implementation details in the code would require further investigation.