ArXiv paper summarization and personalized recommendation tool
This repository provides a personalized daily digest of new arXiv papers, curated using large language models based on user-defined research interests. It targets researchers and academics overwhelmed by the volume of daily publications, offering a more efficient way to discover relevant literature.
How It Works
The system uses GPT-3.5-turbo-16k to rank papers by relevance to a natural-language description of the user's research interests. It fetches abstracts for papers in the chosen arXiv categories, passes them to the LLM for scoring, and generates an HTML digest. This automates the manual filtering researchers would otherwise do by hand.
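The score-and-render steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the prompt wording, the 1-10 scale, the JSON-per-line reply format, the threshold of 7, and the `ask_llm` callable (any function that sends a prompt to a model and returns its text reply) are all assumptions made for the example.

```python
import html
import json


def build_prompt(interest, papers):
    """Build one prompt asking for a relevance score per paper (format is assumed)."""
    lines = [
        "My research interests: " + interest,
        "Rate each paper's relevance from 1 to 10.",
        'Answer with one JSON object per line: {"id": <number>, "score": <number>}',
        "",
    ]
    for i, paper in enumerate(papers, start=1):
        lines.append(f"{i}. {paper['title']}: {paper['abstract']}")
    return "\n".join(lines)


def parse_scores(reply):
    """Map paper number -> score, skipping reply lines that are not usable JSON."""
    scores = {}
    for line in reply.splitlines():
        try:
            item = json.loads(line)
            scores[int(item["id"])] = float(item["score"])
        except (ValueError, KeyError, TypeError):
            continue  # models sometimes add chatter; ignore it
    return scores


def render_digest(papers, scores, threshold=7.0):
    """Return an HTML ordered list of papers at or above the threshold, best first."""
    ranked = sorted(
        ((scores.get(i, 0.0), p) for i, p in enumerate(papers, start=1)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    items = [
        f"<li>{html.escape(p['title'])} (score {s:g})</li>"
        for s, p in ranked
        if s >= threshold
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


def make_digest(interest, papers, ask_llm, threshold=7.0):
    """ask_llm is a hypothetical callable: prompt text in, model reply text out."""
    reply = ask_llm(build_prompt(interest, papers))
    return render_digest(papers, parse_scores(reply), threshold)
```

Keeping the model call behind a plain callable means the ranking and rendering logic can be exercised with a canned reply, without an API key or network access.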
Quick Start & Requirements
Install and run locally: pip install -r src/requirements.txt gradio, then python src/app.py
Set OPENAI_API_KEY, SENDGRID_API_KEY, FROM_EMAIL, and TO_EMAIL as repository secrets.
Highlighted Details
Configured through config.yaml, with a .env file for secrets.
Maintenance & Community
The project is actively maintained, with a roadmap indicating plans for author-based ranking and support for open-source LLMs. Contributions are encouraged via pull requests.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or integration into closed-source projects.
Limitations & Caveats
Currently relies on OpenAI's proprietary models; support for open-source models is planned but not yet implemented. Email delivery requires a SendGrid account and API key.
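Because a run depends on an OpenAI key and, for email, a SendGrid key, it can help to check the settings named in the Quick Start before starting. The sketch below is an assumption-laden example: the helper name `missing_secrets` is hypothetical, and treating all four variables as required is a simplification (the email-related ones may only matter when email delivery is used).

```python
import os

# Variable names taken from the Quick Start; treating all four as required is an assumption.
REQUIRED_SECRETS = ("OPENAI_API_KEY", "SENDGRID_API_KEY", "FROM_EMAIL", "TO_EMAIL")


def missing_secrets(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

Calling `missing_secrets()` with no argument inspects the current process environment; passing a dict makes the check easy to try out in isolation.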