azure-open-ai-embeddings-qna  by ruoccofabrizio

Web app for OpenAI-enabled document Q&A using Azure

created 2 years ago
853 stars

Top 42.8% on sourcepulse

Project Summary

This repository provides a reference architecture for building an enterprise-grade Question Answering system using Azure OpenAI and Retrieval-Augmented Generation (RAG). It targets developers and organizations looking to implement document search and QnA capabilities, offering a deployable solution that integrates various Azure services for robust document understanding and response generation.

How It Works

The application implements the RAG pattern by first generating embeddings for documents using Azure OpenAI's embedding models. When a user asks a question, the system embeds the question and retrieves the most relevant document chunks by vector similarity. These chunks are then passed as context to a GPT model (GPT-3, GPT-3.5, or GPT-4), which generates a concise answer, so that responses are grounded in the provided documents rather than in the model's general knowledge alone.
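The retrieve-then-prompt flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: `toy_embed` is a hypothetical bag-of-words stand-in for an Azure OpenAI embedding call, the in-memory list stands in for a vector store, and the prompt wording in `build_prompt` is an assumption. The final call to a GPT model is left out.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a grounded prompt; the wording is illustrative only."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


CHUNKS = [
    "Invoices are processed within five business days",
    "The office is closed on public holidays",
    "Refunds for invoices are issued by the finance team",
]

if __name__ == "__main__":
    question = "How are invoices processed"
    top = retrieve(question, CHUNKS, k=2)
    # The resulting prompt would be sent to the chat/completion model.
    print(build_prompt(question, top))
```

In a real deployment the embeddings would be computed once at ingestion time and stored in one of the supported vector stores, so only the question is embedded at query time.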

Quick Start & Requirements

  • Local Docker: clone the repository with git clone, create .env from .env.template and fill in the values, then run docker compose up.
  • Prerequisites: an Azure subscription and an Azure OpenAI resource with deployed embedding and instruction/chat models. Depending on the deployment option, Azure Cognitive Search, Azure Cache for Redis Enterprise, Form Recognizer, and Translator resources may also be required.
  • Setup: Local Docker setup is estimated to take minutes once prerequisites are met. Azure deployments involve configuring Azure resources.
  • Links: Reference Architecture, Educational Blog Post, Azure OpenAI API Sample

Highlighted Details

  • Supports multiple vector stores: Azure Cognitive Search, Azure Cache for Redis Enterprise, and Azure PostgreSQL (PGVector).
  • Offers flexible deployment options: Azure Web App, Docker, and local Python environments.
  • Integrates optional features like Form Recognizer for OCR and document text extraction, and Translator for multilingual support.
  • Provides a backend API for direct QnA queries, supporting conversational history.

Maintenance & Community

The repository is maintained by ruoccofabrizio. Further community or maintenance details are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Its disclaimer says the project is for informational purposes, is not a production-ready medical device, and that partners and customers are responsible for regulatory compliance.

Limitations & Caveats

The README notes a change in data format, providing a specific Docker image tag (fruocco/oai-embeddings:2023-03-27_25) for compatibility with older applications. Azure Cognitive Search vector search requires signing up for a private preview.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 90 days
