Async API for ID-based RAG using Langchain
This project provides an ID-based RAG (Retrieval-Augmented Generation) API using FastAPI, Langchain, and PostgreSQL/pgvector. It's designed for scalable document indexing and retrieval, particularly for use cases requiring file-level embedding management, such as integration with LibreChat. The API enables targeted queries by leveraging file metadata stored in a database.
How It Works
The core approach is to organize document embeddings by `file_id`, enabling granular control and targeted retrieval. It uses Langchain's vector store capabilities for efficient search and supports asynchronous operations for better throughput. The API itself is built on FastAPI, a modern, scalable web framework.
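The `file_id`-keyed layout can be pictured with a minimal in-memory sketch. This is an illustration only, not the project's implementation — the real service stores embeddings in PostgreSQL/pgvector through Langchain — but it shows why keying by file ID makes targeted retrieval and file-level deletion cheap:

```python
# In-memory sketch of ID-based retrieval: embeddings are grouped by
# file_id, so queries can be scoped to specific files and a whole
# file's embeddings can be dropped with one keyed operation.
import math
from collections import defaultdict

# store: file_id -> list of (chunk_text, embedding_vector)
store: dict[str, list[tuple[str, list[float]]]] = defaultdict(list)

def add_chunk(file_id: str, text: str, vector: list[float]) -> None:
    """Index a chunk's embedding under its owning file_id."""
    store[file_id].append((text, vector))

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(file_ids: list[str], query_vec: list[float], k: int = 2) -> list[str]:
    """Targeted retrieval: only chunks from the named files are scored."""
    candidates = [(t, v) for fid in file_ids for t, v in store.get(fid, [])]
    candidates.sort(key=lambda tv: cosine(tv[1], query_vec), reverse=True)
    return [t for t, _ in candidates[:k]]

def delete_file(file_id: str) -> None:
    """File-level embedding management: remove one file's chunks."""
    store.pop(file_id, None)
```

In the actual service the same scoping is achieved with metadata filters on the vector store rather than a Python dict, but the contract is the same: every chunk carries its `file_id`, and both search and deletion operate per file.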
Quick Start & Requirements
Install dependencies with `pip install -r requirements.txt`. Run `docker compose up` for a combined RAG API and PostgreSQL/pgvector setup, or run the API on its own with `docker compose -f ./api-compose.yaml up`. Local setup requires configuring a `.env` file and running `uvicorn main:app`.
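The `.env` file typically carries the database connection and embedding-provider settings. A hypothetical sketch follows — every variable name here is an assumption for illustration, not the project's actual configuration keys; consult the repository for the real ones:

```
# Illustrative only -- variable names are assumptions, not the
# project's documented keys.
DB_HOST=localhost
DB_PORT=5432
POSTGRES_DB=mydatabase
POSTGRES_USER=myuser
POSTGRES_PASSWORD=mypassword
```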
Highlighted Details
Supports multiple embedding backends, including Hugging Face models (served via `sentence-transformers` or TEI) and Ollama.
Maintenance & Community
The project is maintained by danny-avila. Further community or roadmap information is not detailed in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is described as a "simple API" and may evolve, suggesting it might be in early development. Specific limitations regarding supported document types beyond PDFs or advanced RAG techniques are not detailed.