ICLR26_Paper_Finder  by wenhangao21

AI paper discovery via semantic abstract search

Created 2 months ago
258 stars

Top 98.1% on SourcePulse

Summary

This project is a semantic search tool for AI research papers. Instead of relying on keyword matching, it searches over full paper abstracts. It targets researchers and engineers who need comprehensive paper discovery across major AI venues. The project provides a user-friendly interface and a tutorial for building custom paper recommenders.

How It Works

The tool runs semantic search over paper abstracts, which carry richer context than titles or keywords, to retrieve papers from AI venues such as ICML, ICLR, NeurIPS, and CVPR. Alongside abstract-based queries, the project includes a tutorial that walks users through building their own paper recommender in about 30 minutes.
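The retrieval loop behind this kind of search (embed every abstract, embed the query, rank by cosine similarity) can be sketched in a few lines. This is a minimal illustration, not the project's code: the `embed` function below is a bag-of-words stand-in for a neural sentence-embedding model, so it only matches shared words rather than meaning, and the paper titles and abstracts are made up for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, abstracts: dict, top_k: int = 2) -> list:
    """Return the titles of the top_k abstracts most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(text)), title)
              for title, text in abstracts.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]


# Hypothetical abstracts, invented for illustration only.
ABSTRACTS = {
    "Paper A": "We study diffusion models for image generation and propose a faster sampler.",
    "Paper B": "A graph neural network for predicting molecular properties of small molecules.",
    "Paper C": "We analyze the generalization of transformers trained on language modeling.",
}

if __name__ == "__main__":
    print(search("faster sampling for diffusion image generation", ABSTRACTS, top_k=1))
```

Swapping `embed` for a real embedding model and the linear scan for a vector store would keep the same structure; the dependency list (sentence_transformers, chromadb) suggests the project takes that route, though the summary does not confirm the details.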

Quick Start & Requirements

Setup requires Anaconda:

  • Create and activate a Python 3.12 conda environment: conda create -n PaperFinder python=3.12 && conda activate PaperFinder
  • Install the dependencies with pip: gdown, chromadb, gradio, markdown, google-generativeai, sentence_transformers.
  • Download the data via gdown (link provided) and unzip it.
  • Run the app: python app.py

Hosted versions are available on a permanent site (http://ai-paper-finder.info/) and on Hugging Face (https://huggingface.co/spaces/wenhanacademia/ai-paper-finder). The permanent site has access issues from Mainland China.
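Before running python app.py, it can help to confirm that the pip dependencies are importable in the active environment. The sketch below is a hypothetical helper, not part of the repository; the import names are assumptions derived from the package list (in particular, google-generativeai is imported as google.generativeai).

```python
import importlib.util

# Import names assumed from the pip package list above.
REQUIRED_MODULES = [
    "gdown",
    "chromadb",
    "gradio",
    "markdown",
    "google.generativeai",  # installed by the google-generativeai package
    "sentence_transformers",
]


def missing_modules(names):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in names:
        try:
            # For dotted names, find_spec imports the parent package first,
            # which raises ModuleNotFoundError if the parent is absent.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it inside the activated PaperFinder environment; any module it lists still needs to be installed with pip.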

Highlighted Details

  • Features semantic search across AI venues, including over 17,000 ICLR 2026 submissions.
  • Enables one-click search result downloads and batch PDF downloads via batch_download.py.
  • Includes multi-lingual support, though performance is suboptimal compared to English.
  • Provides a tutorial (Tutorial_Making_Paper_Recommenders.ipynb) for building custom paper recommenders.

Maintenance & Community

The project is led by PhD student Wenhan Gao; the main contributors are Wenhan Gao and Jingxiang Qu. The team is looking for affordable server options and welcomes collaborators. Community engagement is encouraged via GitHub stars and social media sharing (LinkedIn, X).

Licensing & Compatibility

The README does not explicitly state the software license. Terms for commercial use, derivative works, or closed-source linking are unclear and require clarification from maintainers.

Limitations & Caveats

The permanent hosting site is inaccessible in Mainland China without a VPN. Multi-lingual search performance is suboptimal. ICLR 2026 PDF links may become invalid during rebuttal, requiring updates. The tool is currently in beta.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 36 stars in the last 30 days
