ICLR2019-OpenReviewData by shaohua0116

Crawled data and visualizations for ICLR 2019 OpenReview submissions

Created 7 years ago

387 stars

Top 74.2% on SourcePulse

2 Experts Love This Project

Aravind Srinivas, Cofounder of Perplexity

Yaowei Zheng, Author of LLaMA-Factory

Project Summary

This repository provides a Jupyter Notebook for crawling and visualizing metadata from ICLR 2019 OpenReview. It's designed for researchers and practitioners interested in analyzing trends, popular topics, and reviewer sentiment within a specific machine learning conference. The project offers insights into factors that might influence paper acceptance and reviewer ratings.

How It Works

The project utilizes Selenium and ChromeDriver to automate web browser interactions for scraping data from the ICLR OpenReview website. It employs a headless browser setup for running on servers without a graphical interface. The crawled data, including abstracts, keywords, and reviewer ratings, is then processed to generate visualizations like word clouds of keywords and distributions of reviewer scores.
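
The processing half of this pipeline can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: the record layout (`keywords`, `ratings`), the helper names `keyword_counts` and `rating_histogram`, and the sample records are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical records; the real notebook's data layout may differ.
papers = [
    {"keywords": ["Deep Learning", "robustness"], "ratings": [6, 7, 5]},
    {"keywords": ["deep learning", "theory"], "ratings": [4, 5, 4]},
    {"keywords": ["graph neural network"], "ratings": [7, 8, 7]},
]


def keyword_counts(records):
    """Count how many papers list each keyword (case-insensitive)."""
    counts = Counter()
    for record in records:
        # Use a set so a keyword repeated within one paper counts once.
        counts.update({kw.strip().lower() for kw in record["keywords"]})
    return counts


def rating_histogram(records):
    """Count how often each individual reviewer score occurs."""
    return Counter(score for record in records for score in record["ratings"])


print(keyword_counts(papers).most_common(1))
print(sorted(rating_histogram(papers).items()))
```

A keyword frequency table like this is the usual input for a word cloud (for example via `WordCloud.generate_from_frequencies` in the `wordcloud` package), and the score counts can be plotted directly as a rating distribution.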

Quick Start & Requirements

Installation: pip install pyvirtualdisplay selenium wordcloud imageio
Prerequisites: Python 3.5+, Google Chrome, ChromeDriver, and a Linux environment (Ubuntu recommended).
Setup: Install Google Chrome, ChromeDriver, and the Python dependencies listed above. Step-by-step instructions are provided for Ubuntu.

Highlighted Details

Analyzes ICLR 2019 papers, revealing that accepted papers had an average rating of 6.611, while rejected papers averaged 4.716.
Identifies keywords like "theory," "robustness," and "graph neural network" as potentially correlating with higher reviewer ratings.
Includes a Python function PR to calculate how many papers a given paper "beats" based on average reviewer ratings.
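
A function of the kind described in the last item could look like the sketch below. It is an illustration under stated assumptions rather than the repository's `PR` implementation: the name `papers_beaten`, its signature, the strict "lower average" comparison, and the sample scores are all hypothetical.

```python
from statistics import mean


def papers_beaten(target_ratings, all_ratings):
    """Return how many papers have a strictly lower average rating.

    target_ratings: reviewer scores of the paper of interest.
    all_ratings: one list of reviewer scores per paper in the pool.
    """
    target_avg = mean(target_ratings)
    return sum(1 for ratings in all_ratings if mean(ratings) < target_avg)


# Made-up scores for illustration only.
pool = [[6, 7, 5], [4, 5, 4], [7, 8, 7], [3, 4, 5]]
beaten = papers_beaten([6, 6, 7], pool)
print(f"beats {beaten} of {len(pool)} papers")
```

Ties are not counted as wins here. Whether the original function treats ties the same way is not stated in this summary.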

Maintenance & Community

This repository appears to be a personal project from 2019, with no explicit mention of ongoing maintenance or community channels.

Licensing & Compatibility

The repository does not explicitly state a license.

Limitations & Caveats

The data is specific to ICLR 2019 and may not generalize to other conferences or years. The analysis of keywords correlating with ratings is based on a single conference's data and should be interpreted with caution. The setup instructions are specific to Ubuntu.

Health Check

Last Commit: 6 years ago

Responsiveness: Inactive

Star History

0 stars in the last 30 days