legit-rag by Emissary-Tech

Modular RAG system for production use

Created 7 months ago
265 stars

Top 96.6% on SourcePulse

Project Summary

Legit-RAG is a production-ready, modular Retrieval-Augmented Generation (RAG) pipeline designed for developers and researchers building AI-powered question-answering systems. It offers a structured, extensible framework for implementing a 5-step RAG workflow, leveraging FastAPI, Qdrant, and OpenAI for efficient and intelligent information retrieval and response generation.

How It Works

The system orchestrates a five-stage RAG process, with each stage built on an abstract base class so that different LLM providers, vector databases, and search strategies can be swapped in:

  • Query Routing: an LLM decides whether a query can be answered, needs clarification, or should be rejected.
  • Query Reformulation: the input is refined for better retrieval, often extracting keywords for hybrid search.
  • Context Retrieval: a hybrid search combines semantic (vector) and keyword-based methods, currently using Qdrant for vector storage.
  • Completion Check: the sufficiency of the retrieved context is evaluated against a configurable threshold, returning a confidence score.
  • Answer Generation: a response is produced from the retrieved context, including citations and a confidence score.
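To make the extension point concrete, here is a minimal sketch of what a pluggable retriever could look like under this kind of abstract-base-class design. The names (BaseRetriever, RetrievalResult, InMemoryKeywordRetriever) and signatures are assumptions for illustration, not the interfaces actually defined in legit-rag.

```python
# Hypothetical sketch of the abstract-base-class pattern described above.
# BaseRetriever, RetrievalResult, and InMemoryKeywordRetriever are illustrative
# names, not the actual classes defined in legit-rag.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RetrievalResult:
    text: str
    score: float
    source: str


class BaseRetriever(ABC):
    """Contract any retrieval backend (Qdrant, another vector DB, ...) would satisfy."""

    @abstractmethod
    def retrieve(self, query: str, top_k: int = 5) -> list[RetrievalResult]:
        ...


class InMemoryKeywordRetriever(BaseRetriever):
    """Toy keyword backend: scores documents by word overlap with the query."""

    def __init__(self, documents: list[str]):
        self.documents = documents

    def retrieve(self, query: str, top_k: int = 5) -> list[RetrievalResult]:
        terms = set(query.lower().split())
        scored = [
            RetrievalResult(
                text=doc,
                score=len(terms & set(doc.lower().split())) / max(len(terms), 1),
                source=f"doc-{i}",
            )
            for i, doc in enumerate(self.documents)
        ]
        return sorted(scored, key=lambda r: r.score, reverse=True)[:top_k]
```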

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 3.10+, Docker and Docker Compose, OpenAI API key.
  • Setup: Clone repo, create virtual environment, install dependencies, copy and populate .env with OpenAI API key.
  • Run: docker-compose up -d (API at http://localhost:8000, Qdrant at http://localhost:6333); a sample request is sketched after this list.
  • Docs: Swagger UI at http://localhost:8000/docs.
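Once the containers are up, requests go to the FastAPI service. The endpoint path and payload shape below are hypothetical placeholders; check the Swagger UI at http://localhost:8000/docs for the actual routes and request schemas.

```python
# Hypothetical client call; the /query path and payload shape are assumptions --
# consult http://localhost:8000/docs for the real routes and schemas.
import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={"question": "What does the completion check do?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # expected to include an answer, citations, and a confidence score
```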

Highlighted Details

  • Implements a 5-step RAG workflow: Query Routing, Reformulation, Context Retrieval, Completion Check, and Answer Generation.
  • Supports hybrid search combining semantic and keyword-based retrieval (see the fusion sketch after this list).
  • Extensible architecture with abstract base classes for LLM providers and vector databases.
  • Utilizes FastAPI for API development and Qdrant for vector storage.
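Hybrid retrieval ultimately has to merge the semantic and keyword result lists into a single ranking. The README does not say how legit-rag fuses them, so the snippet below shows reciprocal rank fusion purely as a generic illustration of the idea, not as the project's actual strategy.

```python
# Generic reciprocal-rank-fusion illustration; legit-rag's actual fusion
# strategy is not documented in the README.
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


vector_hits = ["doc-3", "doc-1", "doc-7"]   # from semantic (vector) search
keyword_hits = ["doc-1", "doc-7", "doc-2"]  # from keyword search
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# -> ['doc-1', 'doc-7', 'doc-3', 'doc-2']
```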

Maintenance & Community

The repository is maintained by Emissary-Tech. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The system currently relies exclusively on OpenAI for LLM interactions and Qdrant for vector storage, although the abstract base classes are designed to make swapping in alternatives possible. Streaming responses and additional vector database implementations are listed as future enhancements, indicating they are not yet available.

Health Check

  • Last Commit: 6 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days

Explore Similar Projects

Starred by Eric Zhu (coauthor of AutoGen; Research Scientist at Microsoft Research) and Andre Zayarni (cofounder of Qdrant).

kernel-memory by microsoft

RAG architecture for indexing and querying data using LLMs

0.2% on SourcePulse · 2k stars · Created 2 years ago · Updated 1 day ago