Q&A interface for Airflow and Astronomer documentation
This project provides an end-to-end reference implementation for a Retrieval Augmented Generation (RAG) question-answering system, specifically tailored for Apache Airflow and Astronomer documentation. It targets developers and users seeking to build or understand LLM-powered Q&A applications, offering a comprehensive example that includes data ingestion, prompt orchestration, and feedback loops.
How It Works
The system employs a RAG architecture to ground answers in source material and improve factual accuracy. Data from various sources (Airflow docs, the Astronomer blog, GitHub PRs, Stack Overflow) is ingested, chunked using LangChain, embedded via OpenAI's models, and stored in Weaviate. Prompt orchestration involves generating multiple prompt variations, retrieving relevant documents from Weaviate, reranking them with the Cohere Reranker, and finally using GPT-4o for answer generation. Feedback loops are integrated: user ratings and LLM-based quality assessments refine the system by re-ingesting high-quality Q&A pairs as new data sources.
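The ingest-retrieve-rerank flow above can be sketched in miniature. This is an illustrative stand-in, not the project's actual code: a bag-of-words vector replaces OpenAI embeddings, an in-memory list replaces Weaviate, and keyword overlap replaces the Cohere Reranker. All function names (`chunk`, `embed`, `retrieve`) are hypothetical.

```python
from collections import Counter
import math

def chunk(text: str, size: int = 80) -> list[str]:
    """Split a document into fixed-size character chunks (stand-in for LangChain splitters)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-frequency vector (stand-in for OpenAI embeddings)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, store: list[tuple[str, Counter]], k: int = 3) -> list[str]:
    """Vector search: return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Ingestion: chunk, embed, and store documents (in-memory stand-in for Weaviate).
docs = [
    "Airflow DAGs define workflows as code with explicit dependencies.",
    "Astronomer provides a managed platform for running Apache Airflow.",
    "Weaviate is a vector database used to store document embeddings.",
]
store = [(c, embed(c)) for d in docs for c in chunk(d)]

# Query time: retrieve candidates, then rerank (here: raw keyword overlap,
# standing in for the Cohere Reranker).
query = "How does Airflow define workflows?"
candidates = retrieve(query, store, k=2)
reranked = sorted(
    candidates,
    key=lambda c: len(set(query.lower().split()) & set(c.lower().split())),
    reverse=True,
)
# The top reranked chunk would be passed to GPT-4o as grounding context.
print(reranked[0])
```

In the real system each stand-in is swapped for the production service, but the shape of the pipeline (chunk, embed, store, retrieve, rerank, generate) is the same.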
Quick Start & Requirements
A local development environment can be started with the helper script (scripts/local_dev.py).

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats