rag-demystified by pchunduri6

LLM-powered RAG pipeline for question answering, built from scratch

Created 1 year ago · 845 stars · Top 43.1% on sourcepulse

Project Summary

This project provides a deep dive into advanced Retrieval-Augmented Generation (RAG) pipelines, specifically demystifying the "Sub-question Query Engine" concept. It targets engineers and researchers seeking transparency into complex RAG systems, offering a practical understanding of their mechanics, limitations, and costs beyond high-level framework abstractions.

How It Works

The core insight is that advanced RAG pipelines, like the Sub-question Query Engine, are fundamentally orchestrated sequences of Large Language Model (LLM) calls. Each step—sub-question generation, retrieval, and response aggregation—is achieved through a single LLM call with a carefully crafted prompt template, context, and the user's question. This approach breaks down complex queries into manageable sub-questions, identifies appropriate data sources and retrieval methods (vector or summary), and aggregates the results, offering a more transparent view than opaque framework abstractions.
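As a rough sketch, the whole pipeline reduces to three such call sites: one call to decompose the question, one call per sub-question over retrieved context, and one call to aggregate. The code below is not taken from the repository; the helper names (llm_call, answer), prompt wording, and model name are illustrative assumptions, written against the openai>=1.0 Python client.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def llm_call(prompt: str) -> str:
    """Single LLM call: the only primitive the pipeline is built from."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # model name is an assumption, not the repo's choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def answer(question: str, sources: dict[str, str]) -> str:
    # 1. Sub-question generation: one call built from a prompt template + question.
    subq_prompt = (
        "Break the question into sub-questions, one per line, in the form "
        f"'<source>: <sub-question>', choosing <source> from {list(sources)}.\n"
        f"Question: {question}"
    )
    sub_questions = llm_call(subq_prompt).splitlines()

    # 2. Per-sub-question answering: one call each, built from a prompt template
    #    + retrieved context + sub-question.
    partial_answers = []
    for line in sub_questions:
        source, _, subq = line.partition(":")
        context = sources.get(source.strip(), "")  # stand-in for vector/summary retrieval
        partial_answers.append(
            llm_call(f"Context:\n{context}\n\nAnswer concisely: {subq.strip()}")
        )

    # 3. Aggregation: one final call over the partial answers and the original question.
    return llm_call(
        "Combine these partial answers into one response.\n"
        f"Question: {question}\nPartial answers:\n" + "\n".join(partial_answers)
    )
```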

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Setup: echo OPENAI_API_KEY='yourkey' > .env
  • Run: python complex_qa.py
  • Prerequisites: OpenAI API Key.
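For context on the .env step, the key typically has to be loaded into the environment before the OpenAI client is constructed. The snippet below assumes python-dotenv and is only a sketch of that setup; the repository's actual loading code may differ.

```python
import os

from dotenv import load_dotenv  # python-dotenv; its use here is an assumption
from openai import OpenAI

load_dotenv()  # reads OPENAI_API_KEY from the .env file created above
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
```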

Highlighted Details

  • Demonstrates how complex questions are broken down into sub-questions, each mapped to a specific data source and retrieval function via LLM prompts.
  • Highlights the "universal input pattern" for LLM calls: Prompt Template, Context, and Question (see the sketch after this list).
  • Illustrates failure modes: sensitivity to question phrasing can yield incorrect sub-questions or the wrong retrieval method, and costs vary with the number and length of LLM outputs.
  • Contrasts its transparent, LLM-call-centric approach with the opacity of frameworks like LlamaIndex.
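To make the universal input pattern concrete, here is a minimal, hypothetical helper: every step in the pipeline shares the same call shape, and only the template and context change. The template text and example context are assumptions, not the repository's prompts.

```python
# One prompt shape reused by sub-question generation, retrieval-augmented
# answering, and aggregation alike.
PROMPT = """{template}

Context:
{context}

Question: {question}"""

def build_prompt(template: str, context: str, question: str) -> str:
    return PROMPT.format(template=template, context=context, question=question)

# Example usage (context is made up for illustration, not from the repo's data):
print(build_prompt(
    template="Answer using only the context below.",
    context="Atlanta's population in 2020 was 498,715.",
    question="What was Atlanta's population in 2020?",
))
```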

Maintenance & Community

No specific information on contributors, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The README does not explicitly state a license.

Limitations & Caveats

The pipeline is highly sensitive to the phrasing of the user's question: small changes can produce incorrect sub-questions or select the wrong retrieval function, and mitigating this requires significant prompt engineering. Costs are also hard to estimate up front, because they depend on how many sub-questions the LLM generates and on the length of its outputs.
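As a back-of-envelope illustration of why costs are hard to pin down, the sketch below parameterizes the unknowns (number of sub-questions, tokens per call, per-token prices). All numbers in the usage line are made up for illustration, not actual OpenAI pricing.

```python
def estimate_cost(n_sub_questions: int,
                  prompt_tokens_per_call: int,
                  completion_tokens_per_call: int,
                  usd_per_1k_prompt: float,
                  usd_per_1k_completion: float) -> float:
    """Rough cost: 1 decomposition call + 1 call per sub-question + 1 aggregation call.

    Token counts and the sub-question count are inputs because both depend on
    the LLM's output and cannot be known before running the pipeline.
    """
    n_calls = n_sub_questions + 2
    return n_calls * (
        prompt_tokens_per_call / 1000 * usd_per_1k_prompt
        + completion_tokens_per_call / 1000 * usd_per_1k_completion
    )

# Illustrative numbers only:
print(f"${estimate_cost(4, 1500, 300, 0.01, 0.03):.2f}")
```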

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 11 stars in the last 90 days
