ArXivChatGuru  by redis-developer

RAG app for interacting with ArXiv research papers

created 2 years ago
552 stars

Top 58.8% on sourcepulse

Project Summary

ArXivChatGuru enables users to interact with research papers from ArXiv by leveraging LangChain and Redis. This tool is designed for researchers and developers interested in understanding Retrieval Augmented Generation (RAG) systems, offering a practical demonstration of vector databases and semantic caching in a scientific context.

How It Works

The application retrieves papers from ArXiv based on a user-provided topic. These papers are then segmented into smaller chunks, and embeddings are generated for each chunk. Redis serves as a vector database, storing these embeddings for efficient similarity search. When a user asks a question, the system retrieves the most relevant document chunks from Redis and uses them with an OpenAI model to generate an answer, showcasing the RAG process.
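The flow described above (chunk, embed, store, retrieve, prompt) can be sketched in a few lines. This is only an illustration under loud assumptions, not the project's code: a word-count vector stands in for real model embeddings, a plain Python list stands in for the Redis vector index, and the names chunk_text, embed, retrieve, and build_prompt are hypothetical. The final prompt is what would be sent to the OpenAI model.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (the sizes are arbitrary)."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=2):
    """Return the k chunks whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be passed to the language model."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Stand-ins for paper text; the in-memory list plays the role of the vector store.
papers = [
    "Redis can store vector embeddings and run similarity search over them.",
    "Transformers use attention to weigh tokens in a sequence.",
    "Streamlit builds interactive web apps in Python.",
]
index = [(chunk, embed(chunk)) for paper in papers for chunk in chunk_text(paper)]

question = "How does Redis run similarity search on embeddings?"
top_chunks = retrieve(question, index, k=1)
prompt = build_prompt(question, top_chunks)
```

Changing chunk_size, overlap, or k in this sketch is a cheap way to see how chunking and retrieval depth change what ends up in the prompt.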

Quick Start & Requirements

  • Install: Clone the repository, create and populate .env with OPENAI_API_KEY, then run poetry install --no-root followed by poetry run streamlit run app.py. Alternatively, use docker compose up after setting up .env.
  • Prerequisites: OpenAI API Key, Python 3.x, Poetry.
  • Links: GitHub Repo
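Before launching the app, it can help to confirm that .env actually defines OPENAI_API_KEY. The following is a minimal preflight sketch, not part of the project: read_env_file and has_openai_key are hypothetical helpers, and the parser assumes simple KEY=value lines. The app itself may load .env by other means.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(path=".env"):
    """Return True when the file defines a non-empty OPENAI_API_KEY."""
    return bool(read_env_file(path).get("OPENAI_API_KEY"))
```

Calling has_openai_key() from the repository root returns False when the file is missing or the key is empty, which is a quick check to run before poetry run streamlit run app.py or docker compose up.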

Highlighted Details

  • Demonstrates LangChain's ArXiv Loader for direct paper retrieval.
  • Utilizes Redis as both a vector database and a semantic cache for RAG.
  • Lets users explore how context window size, vector distance, and document retrieval affect RAG performance.
  • Built with Streamlit for an interactive user interface.
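The semantic cache mentioned in the list above can be illustrated with a small sketch: a question close enough in vector distance to an earlier one reuses the stored answer instead of calling the model again. Everything here is an assumption for illustration, not the project's implementation: SemanticCache, answer, and max_distance are hypothetical names, the embedding is a toy word-count vector, and a Python list stands in for Redis.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means no overlap."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


class SemanticCache:
    """In-memory stand-in for a vector-backed cache of question/answer pairs."""

    def __init__(self, max_distance=0.2):
        self.max_distance = max_distance
        self.entries = []  # list of (question embedding, answer)

    def get(self, question):
        """Return the answer of the nearest cached question, if close enough."""
        q = embed(question)
        best = min(self.entries, key=lambda e: cosine_distance(q, e[0]), default=None)
        if best is not None and cosine_distance(q, best[0]) <= self.max_distance:
            return best[1]
        return None

    def put(self, question, answer_text):
        self.entries.append((embed(question), answer_text))


def answer(question, cache, llm):
    """Serve from the cache on a hit; otherwise call the model and store the result."""
    cached = cache.get(question)
    if cached is not None:
        return cached
    result = llm(question)
    cache.put(question, result)
    return result
```

The max_distance threshold is the design lever: a looser value saves more model calls but risks returning an answer to a question that only looks similar.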

Maintenance & Community

The project is hosted under the redis-developer GitHub organization, which suggests backing or interest from the Redis community. Contributions and feedback are welcome.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is explicitly described as a learning tool rather than a production-ready application, and it is not designed for scalability. Its chunking strategy is described as "arbitrary."

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days
