Controllable-RAG-Agent by NirDiamant

RAG agent for complex question answering

created 1 year ago
1,371 stars

Top 30.0% on sourcepulse

Project Summary

This repository offers a sophisticated Retrieval-Augmented Generation (RAG) agent designed for complex question answering on custom data. It targets users needing to overcome limitations of simple semantic search, providing a controllable, autonomous agent that breaks down and reasons through multi-step queries, aiming to prevent hallucinations and ensure grounded answers.

How It Works

The agent uses a deterministic graph as its core reasoning engine. During ingestion, it splits documents into chapters, cleans the text, and generates detailed summaries with an LLM. It also builds a database of book quotes. Both content and summaries are encoded into vector stores. At query time, the question is anonymized, a high-level plan is generated from the anonymized question, and the plan is then de-anonymized and broken down into executable tasks. Each task either retrieves distilled information from the vector stores or generates an answer through chain-of-thought reasoning. The agent continuously verifies its results and re-plans as new context arrives.
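The plan, execute, and re-plan cycle described above can be sketched as a small fixed control loop. This is a minimal illustration, not the repository's implementation: `AgentState`, `run_agent`, and every `toy_*` helper are hypothetical names, and the toy helpers stand in for the LLM and vector-store calls that the real agent would make.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    """Hypothetical state carried through the loop."""
    question: str
    plan: List[str] = field(default_factory=list)     # tasks still to run
    context: List[str] = field(default_factory=list)  # facts gathered so far
    answer: str = ""


def run_agent(
    question: str,
    make_plan: Callable[[str, List[str]], List[str]],
    run_task: Callable[[str, List[str]], str],
    can_answer: Callable[[str, List[str]], bool],
    write_answer: Callable[[str, List[str]], str],
    max_steps: int = 10,
) -> AgentState:
    """Run plan -> execute -> re-plan until the question can be answered."""
    state = AgentState(question=question)
    state.plan = make_plan(question, state.context)
    for _ in range(max_steps):  # hard cap so the loop always terminates
        if can_answer(question, state.context) or not state.plan:
            break
        task = state.plan.pop(0)
        state.context.append(run_task(task, state.context))
        # Re-plan with the newly gathered context.
        state.plan = make_plan(question, state.context)
    state.answer = write_answer(question, state.context)
    return state


# Toy stand-ins (assumptions) for the LLM planner and the retrievers.
TOY_FACTS = {
    "look up where the keeper lives": "The keeper lives in a lighthouse.",
    "look up who visits the keeper": "A sailor visits the keeper.",
}


def toy_make_plan(question: str, context: List[str]) -> List[str]:
    # Plan only the lookups whose facts are not in the context yet.
    return [task for task, fact in TOY_FACTS.items() if fact not in context]


def toy_run_task(task: str, context: List[str]) -> str:
    return TOY_FACTS[task]


def toy_can_answer(question: str, context: List[str]) -> bool:
    return all(fact in context for fact in TOY_FACTS.values())


def toy_write_answer(question: str, context: List[str]) -> str:
    return " ".join(context)


if __name__ == "__main__":
    final = run_agent(
        "Who visits the keeper, and where?",
        toy_make_plan, toy_run_task, toy_can_answer, toy_write_answer,
    )
    print(final.answer)
```

Passing the planner, task runner, and verifier in as plain callables keeps the control flow fixed while the model-backed parts stay swappable, which is one plausible reading of "controllable" here.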

Quick Start & Requirements

  • Installation: Clone the repository, set up a .env file with OPENAI_API_KEY and GROQ_API_KEY, and install dependencies with pip install -r requirements.txt. Alternatively, use docker-compose up --build.
  • Prerequisites: Python 3.8+, API key for an LLM provider (OpenAI, Groq, etc.).
  • Usage: Explore sophisticated_rag_agent_harry_potter.ipynb for a tutorial. Run real-time visualization with streamlit run simulate_agent.py or via Docker at http://localhost:8501/.
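Before running the notebook or the Streamlit app, it can help to confirm that the API keys are visible to the process. The sketch below is an assumption-laden convenience check, not part of the repository: `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and it assumes both keys named in the installation step are needed and that the .env values have already been loaded into the environment (for example by Docker Compose or a dotenv loader).

```python
import os
from typing import Iterable, List, Mapping

# Key names taken from the installation step; treating both as required
# is an assumption.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY")


def missing_keys(
    env: Mapping[str, str],
    required: Iterable[str] = REQUIRED_KEYS,
) -> List[str]:
    """Return the required keys that are absent or blank in env."""
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        print("Missing API keys: " + ", ".join(missing))
    else:
        print("All API keys are set.")
```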

Highlighted Details

  • Built on LangChain, FAISS, and Streamlit, with Ragas for evaluation.
  • Employs techniques like question anonymization, task decomposition, content distillation, and chain-of-thought reasoning.
  • Incorporates self-reflection and critique mechanisms inspired by Self-RAG.
  • Evaluates performance using Ragas metrics (Answer Correctness, Faithfulness, etc.).
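Question anonymization, the first technique listed above, can be illustrated with a small placeholder-substitution sketch. The likely intent is to keep the planner from leaning on what the model already knows about named entities, though this summary does not say so. Everything here is an assumption for illustration: `anonymize` and `deanonymize` are hypothetical names, and the entity list is supplied by the caller because the summary does not describe how the project detects entities.

```python
import re
from typing import Dict, List, Tuple


def anonymize(question: str, entities: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each known entity with a placeholder such as <E0>."""
    mapping: Dict[str, str] = {}
    anonymized = question
    # Longest names first, so a full name is replaced before any shorter
    # name contained in it.
    for i, entity in enumerate(sorted(entities, key=len, reverse=True)):
        placeholder = f"<E{i}>"
        pattern = re.compile(re.escape(entity))
        if pattern.search(anonymized):
            anonymized = pattern.sub(placeholder, anonymized)
            mapping[placeholder] = entity
    return anonymized, mapping


def deanonymize(steps: List[str], mapping: Dict[str, str]) -> List[str]:
    """Restore the original entity names in each plan step."""
    restored = []
    for step in steps:
        for placeholder, entity in mapping.items():
            step = step.replace(placeholder, entity)
        restored.append(step)
    return restored


if __name__ == "__main__":
    masked, names = anonymize("How did Alice help Bob Smith?", ["Alice", "Bob Smith"])
    print(masked)
    # A planner would produce steps from the masked question; these two
    # steps are written by hand for the example.
    plan = ["Find what <E1> did", "Find how that affected <E0>"]
    print(deanonymize(plan, names))
```

The angle-bracket placeholders are chosen so that one placeholder is never a prefix of another, which keeps the restore step a plain string replacement.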

Maintenance & Community

The project is maintained by Nir Diamant. Updates and insights are shared via a Substack newsletter. Contributions are welcomed via pull requests or issues.

Licensing & Compatibility

Licensed under the Apache-2.0 License. This permissive license allows for commercial use and integration with closed-source projects.

Limitations & Caveats

The project relies on external LLM APIs, which incur costs and may impose rate limits. Although the agent is designed to prevent hallucinations, its effectiveness depends on the quality of the LLM and of the grounding data. The "sophisticated deterministic graph" is described only conceptually here; understanding how it is implemented requires reading the code.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 195 stars in the last 90 days
