Build a production-grade RAG research assistant
This project provides a comprehensive, hands-on course for building a production-grade Retrieval-Augmented Generation (RAG) system, specifically an AI research assistant that curates and answers questions about arXiv papers. It targets AI/ML engineers, software engineers, and data scientists looking to master end-to-end AI application development using industry best practices.
How It Works
The system uses a microservices architecture orchestrated with Docker Compose. Key components include FastAPI for the API, PostgreSQL for metadata storage, OpenSearch for hybrid search, Apache Airflow for workflow automation, and Ollama for local LLM serving. The data pipeline fetches papers from the arXiv API, parses the PDFs with Docling, and stores the extracted metadata and content. Later weeks of the course are slated to cover advanced RAG techniques such as hybrid search and context-aware chunking, as well as production deployment.
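To make the first stage of that data pipeline concrete, here is a minimal sketch of fetching and normalizing paper metadata. It is not the project's own code: it assumes the public arXiv Atom query endpoint, and the names PaperRecord, build_query_url, parse_feed, and SAMPLE_FEED are hypothetical, introduced only for illustration. The sample feed is made-up placeholder data so the example runs without network access. PDF parsing with Docling and storage in PostgreSQL are deliberately left out.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Public arXiv query endpoint; responses are Atom XML.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"


@dataclass
class PaperRecord:
    """Hypothetical metadata row; the real project's schema may differ."""
    arxiv_id: str
    title: str
    abstract: str
    authors: List[str]
    pdf_url: Optional[str]


def build_query_url(category: str, max_results: int = 5) -> str:
    """Build a query URL for the newest submissions in one arXiv category."""
    params = {
        "search_query": f"cat:{category}",
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_feed(atom_xml: str) -> List[PaperRecord]:
    """Turn an arXiv Atom feed into a list of metadata records."""
    root = ET.fromstring(atom_xml)
    records = []
    for entry in root.findall(f"{ATOM}entry"):
        id_url = entry.findtext(f"{ATOM}id", default="")
        # arXiv marks the PDF link with title="pdf".
        pdf_url = None
        for link in entry.findall(f"{ATOM}link"):
            if link.get("title") == "pdf":
                pdf_url = link.get("href")
        records.append(
            PaperRecord(
                arxiv_id=id_url.rsplit("/", 1)[-1],
                # Collapse the line breaks arXiv inserts into long fields.
                title=" ".join(entry.findtext(f"{ATOM}title", default="").split()),
                abstract=" ".join(entry.findtext(f"{ATOM}summary", default="").split()),
                authors=[
                    author.findtext(f"{ATOM}name", default="")
                    for author in entry.findall(f"{ATOM}author")
                ],
                pdf_url=pdf_url,
            )
        )
    return records


# Placeholder feed with invented values, shaped like an arXiv response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>A Placeholder
      Title</title>
    <summary>  Placeholder abstract text.  </summary>
    <author><name>Ada Example</name></author>
    <author><name>Bo Sample</name></author>
    <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1" rel="related" type="application/pdf"/>
  </entry>
</feed>"""

for record in parse_feed(SAMPLE_FEED):
    print(record.arxiv_id, record.title, record.pdf_url)
```

In a setup like the one described, a scheduled Airflow task could call something like build_query_url, download the feed, and hand the resulting records to the storage layer, with the pdf_url field pointing at the file to parse next.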
Quick Start & Requirements
Run docker compose up --build -d to start all services.
Highlighted Details
Maintenance & Community
The project is developed by Jam With AI. Further community or roadmap details are not explicitly provided in the README.
Licensing & Compatibility
The project is released under the MIT License, permitting commercial use and closed-source linking.
Limitations & Caveats
The project is presented as a course with future weeks (3-6) marked as "Coming Soon," indicating incomplete functionality for the full RAG pipeline and deployment stages.