zotero-arxiv-daily  by TideDra

arXiv paper recommendation tool based on Zotero library context

created 8 months ago
2,164 stars

Top 21.3% on sourcepulse

Project Summary

This project provides a daily digest of new arXiv papers tailored to a user's research interests, as inferred from their Zotero library. It's designed for researchers and academics seeking to stay updated with relevant literature without manual searching. The primary benefit is automated, personalized paper discovery delivered via email.

How It Works

The system retrieves papers from a user's Zotero library and new arXiv publications from the previous day. It then calculates embeddings for paper abstracts using a sentence transformer model. A relevance score for each new arXiv paper is determined by its weighted similarity to the user's Zotero library papers, with more recent Zotero entries carrying higher weight. Optionally, it generates AI-powered TL;DR summaries using a local or cloud-based LLM, extracting key information from paper abstracts, introductions, and conclusions.
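The scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes embeddings are already computed and passed in as plain lists of floats, that the library is ordered newest first, and that recency is modeled with a logarithmic decay. The function names and the decay formula are assumptions made for this example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recency_weights(n):
    """Normalized weights for n library papers ordered newest first.

    Assumed decay: rank 0 (the newest entry) gets the largest weight.
    """
    raw = [1.0 / (1.0 + math.log10(rank + 1)) for rank in range(n)]
    total = sum(raw)
    return [w / total for w in raw]

def score_candidates(library, candidates):
    """Rank candidate papers by weighted similarity to the library.

    library: list of embeddings, newest Zotero entry first.
    candidates: dict mapping a paper title to its embedding.
    Returns (title, score) pairs, highest score first.
    """
    weights = recency_weights(len(library))
    scores = {
        title: sum(w * cosine(emb, lib) for w, lib in zip(weights, library))
        for title, emb in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy 2-dimensional embeddings: the newest library entry points along x.
library = [[1.0, 0.0], [0.0, 1.0]]
candidates = {"paper_a": [1.0, 0.0], "paper_b": [0.0, 1.0]}
ranked = score_candidates(library, candidates)
```

In this toy case `paper_a` ranks first because it matches the most recent library entry, which carries the larger weight.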

Quick Start & Requirements

  • Install/Run: Fork the repository and configure GitHub Actions secrets. To run locally instead, install uv and set the environment variables.
  • Prerequisites: Zotero API key and ID, arXiv query categories, and SMTP server details for email delivery. Local LLM use requires a ~3GB model download.
  • Setup: Minimal when run through GitHub Actions, where only the secrets need configuring.
  • Links: Zotero API, arXiv Categories

Highlighted Details

  • Automated daily email delivery via GitHub Actions.
  • AI-generated TL;DR summaries for papers.
  • Affiliation resolution and PDF/code links included in emails.
  • Configurable Zotero collection exclusion using gitignore-style patterns.
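As a rough illustration of how gitignore-style exclusion could behave, the sketch below approximates it with Python's standard `fnmatch` module. It is a simplification under stated assumptions: collections are identified by slash-separated paths, negation (`!`) and `**` are not handled, and `*` may match across `/` (which real gitignore matching does not allow). The helper name `is_excluded` is hypothetical and not taken from the project.

```python
from fnmatch import fnmatchcase

def is_excluded(collection_path, patterns):
    """Return True if a collection path matches any exclude pattern.

    Simplified gitignore-like rules (an approximation, see above):
    a pattern without "/" matches any single path component, while a
    pattern containing "/" is matched against the path and its prefixes.
    """
    parts = collection_path.strip("/").split("/")
    for pattern in patterns:
        pattern = pattern.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        pattern = pattern.rstrip("/")
        if "/" in pattern:
            # Excluding a parent collection also excludes its children,
            # so test every leading prefix of the path.
            prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
            if any(fnmatchcase(p, pattern.lstrip("/")) for p in prefixes):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            return True
    return False

patterns = ["Drafts", "Archive/*", "# a comment line"]
kept = [
    path
    for path in ["Reading/Papers", "Reading/Drafts", "Archive/2020"]
    if not is_excluded(path, patterns)
]
```

Here only `Reading/Papers` survives the filter; the other two paths match `Drafts` and `Archive/*` respectively.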

Maintenance & Community

The project is marked as active. Contributions are welcomed via pull requests to the dev branch.

Licensing & Compatibility

  • License: AGPLv3.
  • Compatibility: AGPLv3 is a strong copyleft license. Modified versions that are distributed, or made available to users over a network, must be released with their source code under the same license, which can complicate integration into closed-source applications.

Limitations & Caveats

The recommendation algorithm is described as simple and may not accurately reflect user interests. Generating TL;DRs locally on GitHub Actions runners can be time-consuming (~70s per paper), potentially exceeding execution time limits with a high MAX_PAPER_NUM.
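A back-of-the-envelope estimate makes the time constraint concrete. The sketch below uses the ~70s per paper figure quoted above. The 6-hour job limit is an assumption based on the default for GitHub-hosted runners and may differ for a given workflow (for example when `timeout-minutes` is set). The function names are illustrative only.

```python
SECONDS_PER_PAPER = 70             # approximate figure quoted above
JOB_LIMIT_SECONDS = 6 * 60 * 60    # assumed limit: 6 hours per job

def estimated_tldr_seconds(max_paper_num, seconds_per_paper=SECONDS_PER_PAPER):
    """Estimated time spent on TL;DR generation alone, in seconds."""
    return max_paper_num * seconds_per_paper

def max_papers_within(limit_seconds=JOB_LIMIT_SECONDS,
                      seconds_per_paper=SECONDS_PER_PAPER):
    """Largest MAX_PAPER_NUM whose TL;DR generation fits in the limit."""
    return limit_seconds // seconds_per_paper

minutes_for_50 = estimated_tldr_seconds(50) / 60   # 3500 s, a bit under an hour
ceiling = max_papers_within()                      # 21600 // 70
```

Under these assumptions, 50 papers take roughly 58 minutes of summarization, and the ceiling is about 308 papers. Retrieval and embedding also consume part of the budget, so the practical ceiling is lower.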

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 2
  • Issues (30d): 15

Star History

1,447 stars in the last 90 days
