Scientific research assistant for answering research queries
OpenResearcher is an AI-powered scientific research assistant designed to answer queries using the arXiv corpus. It targets researchers and power users seeking accelerated access to the latest scientific insights, offering a competitive alternative to existing RAG systems.
How It Works
OpenResearcher employs a Retrieval-Augmented Generation (RAG) architecture. It uses Qdrant for vector search over paper content and Elasticsearch for metadata retrieval. This dual-store approach aims to provide richer and more relevant answers by combining semantic similarity with structured metadata search. The project reports outperforming other RAG systems in human and GPT-4 evaluations of correctness, richness, and relevance.
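The idea of combining a structured metadata filter with semantic ranking can be illustrated with a small, self-contained sketch. This is not OpenResearcher's code: it replaces Qdrant and Elasticsearch with in-memory stand-ins, and the `Paper` fields, the function names, and the toy records and embeddings are all made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record layout; the real schema is not given in the README.
    paper_id: str
    title: str
    year: int
    category: str
    embedding: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def metadata_filter(papers, category=None, min_year=None):
    """Stand-in for a structured metadata query (the Elasticsearch role)."""
    return [
        p for p in papers
        if (category is None or p.category == category)
        and (min_year is None or p.year >= min_year)
    ]


def hybrid_search(papers, query_embedding, category=None, min_year=None, top_k=3):
    """Filter by metadata, then rank the survivors by semantic similarity
    (the vector-search role that Qdrant plays in the described system)."""
    candidates = metadata_filter(papers, category=category, min_year=min_year)
    ranked = sorted(
        candidates,
        key=lambda p: cosine(query_embedding, p.embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data: titles and 3-dimensional embeddings are invented for the example.
PAPERS = [
    Paper("p1", "Dense retrieval for QA", 2023, "cs.CL", [0.9, 0.1, 0.0]),
    Paper("p2", "Graph neural networks survey", 2021, "cs.LG", [0.1, 0.9, 0.0]),
    Paper("p3", "Retrieval-augmented generation", 2024, "cs.CL", [0.8, 0.2, 0.1]),
    Paper("p4", "Older retrieval methods", 2015, "cs.CL", [0.95, 0.05, 0.0]),
]

if __name__ == "__main__":
    hits = hybrid_search(PAPERS, [1.0, 0.0, 0.0], category="cs.CL", min_year=2020)
    for paper in hits:
        print(paper.paper_id, paper.title)
```

In this toy setup the metadata filter removes `p4` even though its embedding is closest to the query, which is the kind of effect a metadata constraint adds on top of pure similarity ranking. How the real system merges the two result sets is not described in the README.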
Quick Start & Requirements
Create an environment (python=3.10), activate it, cd into the directory, and run pip install -r requirements.txt.
Set up Qdrant (docker pull qdrant/qdrant and docker run ...) and Elasticsearch (via Docker).
Data directory: /data.
Launch the UI with streamlit run ui_app.py.
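Because the UI depends on both Qdrant and Elasticsearch being up, a quick reachability check before launching can save some debugging. The sketch below is an assumption-laden helper, not part of the project: it assumes both services run on localhost at their usual default ports (6333 for Qdrant's REST API, 9200 for Elasticsearch HTTP), and the names `is_port_open`, `check_services`, and `SERVICES` are hypothetical.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults; adjust if your docker run command maps different ports.
SERVICES = {
    "qdrant": ("localhost", 6333),
    "elasticsearch": ("localhost", 9200),
}


def check_services(services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, reachable in check_services().items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```

An open port only shows that something is listening there; it does not confirm that the collections or indices the application expects have been created.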
Highlighted Details
Maintenance & Community
The project is associated with GAIR-NLP, and its paper was accepted to the EMNLP 2024 Demo Track. Further community or roadmap information is not detailed in the README.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The setup process is complex, requiring the installation and configuration of multiple external services like Qdrant and Elasticsearch. The README does not specify the exact hardware requirements for running the models or indexing the data, which can be substantial.