ask-astro by astronomer

Q&A interface for Airflow and Astronomer documentation

created 1 year ago
255 stars

Top 99.2% on sourcepulse

Project Summary

This project provides an end-to-end reference implementation of a Retrieval-Augmented Generation (RAG) question-answering system for Apache Airflow and Astronomer documentation. It is aimed at developers who want to build or understand LLM-powered Q&A applications, and it covers data ingestion, prompt orchestration, and feedback loops.

How It Works

The system uses a RAG architecture to ground answers in source documentation. Data from several sources (Airflow docs, the Astronomer blog, GitHub PRs, Stack Overflow) is ingested, chunked using LangChain, embedded with OpenAI's models, and stored in Weaviate. Prompt orchestration generates multiple variations of the user's prompt, retrieves relevant documents from Weaviate, reranks them with Cohere Reranker, and then uses GPT-4o to generate the answer. Feedback loops combine user ratings with LLM-based quality assessments, and high-quality Q&A pairs are re-ingested as a new data source.
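The retrieval flow described above can be sketched in plain Python. This is a minimal, dependency-free illustration under stated assumptions, not the project's code: the bag-of-words `embed` stands in for an OpenAI embedding model, the in-memory `store` list stands in for Weaviate, `chunk` stands in for a LangChain text splitter, and `rerank` is a hypothetical keyword-overlap placeholder for Cohere Reranker. All function names are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, max_words=20):
    """Split text into fixed-size word windows (stand-in for a text splitter)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(store, queries, k=3):
    """Collect the top-k chunks for each query variation, without duplicates."""
    seen, hits = set(), []
    for query in queries:
        query_vec = embed(query)
        ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        for text, _ in ranked[:k]:
            if text not in seen:
                seen.add(text)
                hits.append(text)
    return hits


def rerank(question, docs, top_n=2):
    """Hypothetical reranker: order candidates by shared-token count."""
    q_tokens = set(tokenize(question))
    return sorted(docs, key=lambda d: len(q_tokens & set(tokenize(d))), reverse=True)[:top_n]


def build_prompt(question, docs):
    """Assemble the final prompt that would be sent to the answering LLM."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "A DAG in Airflow is a collection of tasks with dependencies.",
    "Weaviate is a vector database that stores embeddings.",
]
# Ingest: chunk each document, embed each chunk, keep (text, vector) pairs.
store = [(c, embed(c)) for d in docs for c in chunk(d)]

question = "What is a DAG in Airflow?"
variations = [question, "Airflow DAG definition"]  # stand-in for LLM-generated variants
context = rerank(question, retrieve(store, variations))
prompt = build_prompt(question, context)
```

In a real deployment each stand-in would be replaced by a call to the corresponding service, and `prompt` would be passed to the answering model.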

Quick Start & Requirements

  • Install/Run: Local development is supported via Python scripts (scripts/local_dev.py).
  • Prerequisites: OpenAI API key, Cohere API key, Weaviate instance.
  • Setup: The provided scripts set up the local development environment.
  • Links: See the Ingest README for data source configuration.
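Before running anything locally, it can help to confirm that the prerequisites listed above are configured. The sketch below assumes they are supplied as environment variables. The variable names are hypothetical and chosen only for illustration; the names the project actually reads may differ, so check its own documentation.

```python
import os

# Hypothetical variable names, not taken from the project.
REQUIRED_VARS = ("OPENAI_API_KEY", "COHERE_API_KEY", "WEAVIATE_URL")


def missing_settings(env):
    """Return the required settings that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All prerequisite settings are present.")
```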

Highlighted Details

  • Leverages LangChain for prompt engineering and data processing.
  • Utilizes Weaviate as the vector database.
  • Employs OpenAI's embedding models and GPT-3.5/GPT-4o for LLM calls.
  • Integrates Cohere Reranker for improved document relevance.
  • Includes a Slack bot interface for accessibility.
  • Features a feedback loop that re-ingests high-quality Q&A pairs to improve the system over time.
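The feedback loop in the last item can be illustrated with a small filter that picks answered questions worth re-ingesting. This is a sketch under assumptions: the `QAPair` fields, the rating scale, the 0.8 threshold, and the rule that both signals must agree are all hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str
    user_rating: int   # hypothetical scale: +1 helpful, -1 unhelpful, 0 unrated
    llm_score: float   # hypothetical 0.0-1.0 quality score from an LLM judge


def select_for_reingestion(pairs, min_llm_score=0.8):
    """Keep pairs that a user rated positively and the LLM judge scored highly."""
    return [p for p in pairs if p.user_rating > 0 and p.llm_score >= min_llm_score]


def to_document(pair):
    """Format a Q&A pair as text that could be chunked and embedded like any other source."""
    return f"Question: {pair.question}\nAnswer: {pair.answer}"


history = [
    QAPair("How do I retry a task?", "Set the retries argument on the task.", 1, 0.9),
    QAPair("What is XCom?", "Not sure.", -1, 0.2),
    QAPair("How do I pause a DAG?", "Use the toggle in the UI.", 1, 0.5),
]
new_documents = [to_document(p) for p in select_for_reingestion(history)]
```

Requiring both signals is a conservative choice: it keeps low-quality answers out of the corpus at the cost of re-ingesting fewer pairs.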

Maintenance & Community

  • Developed by Astronomer.
  • Contact: ai@astronomer.io for questions and feedback.

Licensing & Compatibility

  • The README does not explicitly state the license.

Limitations & Caveats

  • Requires OpenAI and Cohere API keys; calls to both services incur usage costs.
  • Relies on a running Weaviate instance.
  • The project is described as a "reference implementation," which suggests it is meant as an example to adapt and extend.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star history: 30 stars in the last 90 days
