findpapers by jonatasgrosman

CLI tool for academic paper discovery across multiple databases

created 5 years ago
277 stars

Top 94.5% on sourcepulse

Project Summary

Findpapers is a command-line tool that helps researchers discover and manage academic literature. It automates the main steps of a literature review: searching multiple scientific databases, refining the results, downloading full-text papers, and generating BibTeX citations.

How It Works

The tool queries specified databases (ACM, arXiv, bioRxiv, IEEE, medRxiv, PubMed, Scopus) using a structured query language that supports boolean operators, wildcards, and date ranges. It then allows users to interactively filter, classify, and select papers based on metadata, abstracts, and citation counts. Findpapers also attempts to download full-text PDFs and can handle proxy configurations for paywalled content.
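To make the idea of a query with boolean operators, wildcards, and a date range concrete, here is a minimal Python sketch. It does not use Findpapers or its query syntax; the `Paper` record, the `term_matches` and `select` helpers, and the sample titles are all hypothetical and exist only for illustration. The boolean expression `(learn* OR neural) AND NOT survey` is written directly as Python logic rather than parsed from a query string.

```python
import re
from dataclasses import dataclass
from datetime import date
from fnmatch import fnmatchcase


@dataclass
class Paper:
    # Hypothetical record type for illustration; not the Findpapers data model.
    title: str
    published: date


def term_matches(term: str, text: str) -> bool:
    """Return True if any word in `text` matches `term` ('*' and '?' are wildcards)."""
    words = re.findall(r"\w+", text.lower())
    return any(fnmatchcase(word, term.lower()) for word in words)


def select(papers, since: date, until: date):
    """Keep papers matching (learn* OR neural) AND NOT survey within [since, until]."""
    selected = []
    for paper in papers:
        hit = (
            term_matches("learn*", paper.title) or term_matches("neural", paper.title)
        ) and not term_matches("survey", paper.title)
        if hit and since <= paper.published <= until:
            selected.append(paper)
    return selected


# Made-up sample records.
papers = [
    Paper("Deep Learning for Proteins", date(2021, 5, 1)),
    Paper("A Survey of Neural Networks", date(2022, 1, 10)),
    Paper("Learned Index Structures", date(2017, 12, 4)),
    Paper("Graph Databases in Practice", date(2021, 3, 3)),
]

result = select(papers, since=date(2020, 1, 1), until=date(2022, 12, 31))
print([p.title for p in result])  # ['Deep Learning for Proteins']
```

The second paper is excluded by `NOT survey`, the third by the date range, and the fourth because no term matches.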

Quick Start & Requirements

  • Install via pip: pip install findpapers
  • Requires Python 3.7+
  • API tokens for IEEE and Scopus may be required for full access to those databases.
  • Usage documentation and examples are available in the README.

Highlighted Details

  • Supports searching across seven major academic databases.
  • Advanced query syntax with boolean logic, wildcards, and date filtering.
  • Interactive refinement and classification of search results.
  • Full-text PDF download capability with proxy support.
  • Identifies potentially predatory publications based on Beall's List.
  • Automatically merges duplicate paper entries from different sources.
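As an illustration of what merging duplicate entries from different sources can involve, the following Python sketch groups records by a normalized title and combines their fields. This is an assumed approach, not the algorithm Findpapers actually uses; the record fields (`title`, `database`, `doi`, `citations`), the helper names, and the sample data are hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_duplicates(records):
    """Merge records that share a normalized title (an assumed matching rule)."""
    merged = {}
    for record in records:
        key = normalize_title(record["title"])
        entry = merged.setdefault(
            key,
            {"title": record["title"], "doi": None, "citations": None, "databases": []},
        )
        # Remember every database that returned this paper.
        if record["database"] not in entry["databases"]:
            entry["databases"].append(record["database"])
        # Fill in a DOI if an earlier record lacked one.
        if entry["doi"] is None:
            entry["doi"] = record.get("doi")
        # Keep the highest citation count seen so far.
        cites = record.get("citations")
        if cites is not None:
            current = entry["citations"]
            entry["citations"] = cites if current is None else max(current, cites)
    return list(merged.values())


# Made-up sample records; the DOI is a placeholder, not a real identifier.
records = [
    {"title": "An Example Paper on Widgets", "database": "arXiv"},
    {"title": "An example paper on widgets.", "database": "Scopus",
     "doi": "10.0000/example", "citations": 10},
    {"title": "A Different Paper", "database": "PubMed", "citations": 3},
]

unique = merge_duplicates(records)
print(len(unique))  # 2
```

Matching on titles alone can merge distinct papers that share a title, so a real implementation would likely also compare identifiers such as DOIs or publication dates.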

Maintenance & Community

  • The project is open for contributions, with guidelines provided.
  • A citation is available for academic referencing.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

  • The PDF download heuristics do not work for every paper, and many papers are behind paywalls.
  • Specific query syntax rules apply, particularly for bioRxiv and medRxiv, which have stricter limitations on wildcards and boolean operators.
  • The README mentions that formal documentation is still under development.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 1
  • Issues (30d): 0

Star History

  • 18 stars in the last 90 days
