GPT-GNN by acbull

Graph pre-training framework for initializing GNNs

Created 5 years ago

498 stars

Top 62.4% on SourcePulse

Project Summary

GPT-GNN is a framework for generative pre-training of Graph Neural Networks (GNNs). The pre-trained weights serve as an initialization for GNNs on large-scale and heterogeneous graphs. It targets researchers and practitioners working with complex graph data who want to improve downstream task performance through self-supervised pre-training.

How It Works

GPT-GNN employs a generative pre-training approach, where GNNs are trained to reconstruct masked or corrupted graph attributes and edges. This self-supervised objective allows the model to learn rich representations of graph structure and node features without explicit labels. The framework supports both attribute generation (using text or pre-trained embeddings) and edge generation tasks, offering flexibility in representation learning.
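
To make the masking idea concrete, the following is a minimal sketch of how self-supervised targets could be built by hiding a fraction of node attributes and edges. It is not the repository's implementation: the function name mask_for_pretraining, the mask_ratio parameter, and the dictionary/list graph representation are all assumptions made for illustration, and the actual GPT-GNN objective is more involved than uniform random masking.

```python
import random


def mask_for_pretraining(features, edges, mask_ratio=0.15, seed=0):
    """Hide a random subset of node attributes and edges (illustrative only).

    features: dict mapping node id -> attribute vector (list of floats)
    edges:    list of (source, target) tuples
    Returns the corrupted inputs plus the held-out targets that a model
    would be trained to reconstruct.
    """
    rng = random.Random(seed)

    def pick(n_items):
        # Choose at least one index to mask when anything is available.
        if n_items == 0:
            return set()
        k = min(n_items, max(1, int(n_items * mask_ratio)))
        return set(rng.sample(range(n_items), k))

    nodes = sorted(features)
    masked_nodes = {nodes[i] for i in pick(len(nodes))}
    masked_edge_idx = pick(len(edges))

    # Masked attributes are replaced by None; the originals become targets.
    visible_features = {
        n: (None if n in masked_nodes else f) for n, f in features.items()
    }
    attr_targets = {n: features[n] for n in masked_nodes}

    # Masked edges are removed from the input graph and kept as targets.
    visible_edges = [e for i, e in enumerate(edges) if i not in masked_edge_idx]
    edge_targets = [e for i, e in enumerate(edges) if i in masked_edge_idx]
    return visible_features, attr_targets, visible_edges, edge_targets
```

A model would receive visible_features and visible_edges as input and be scored on how well it recovers attr_targets (attribute generation) and edge_targets (edge generation), so no human-provided labels are needed.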

Quick Start & Requirements

Install dependencies via pip install -r requirements.txt.
Requires PyTorch 1.3.0, PyTorch Geometric 1.3.2, and specific torch-cluster, torch-scatter, and torch-sparse versions.
Datasets (OAG, Reddit) and pre-trained word2vec models need to be downloaded separately.
Pre-training command example: python pretrain_OAG.py --attr_type text --conv_name hgt --n_layers 3 --pretrain_model_dir /datadrive/models/gta_all_cs3
Fine-tuning command example: python finetune_OAG_PF.py --use_pretrain --pretrain_model_dir /datadrive/models/gta_all_cs3 --n_layer 3 --data_percentage 0.1
Official pre-trained models are available for OAG-CS and Reddit.
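
The two example commands above can also be assembled programmatically, which may help when sweeping over model directories or data percentages. The sketch below only rebuilds the commands shown above. The helper names build_pretrain_cmd and build_finetune_cmd are hypothetical, and the flag spellings are copied verbatim from the examples rather than checked against the scripts.

```python
import shlex


def build_pretrain_cmd(model_dir, attr_type="text", conv_name="hgt", n_layers=3):
    """Return the pre-training command as an argument list (hypothetical helper)."""
    return [
        "python", "pretrain_OAG.py",
        "--attr_type", attr_type,
        "--conv_name", conv_name,
        "--n_layers", str(n_layers),
        "--pretrain_model_dir", model_dir,
    ]


def build_finetune_cmd(model_dir, n_layer=3, data_percentage=0.1):
    """Return the fine-tuning command as an argument list (hypothetical helper)."""
    return [
        "python", "finetune_OAG_PF.py",
        "--use_pretrain",
        "--pretrain_model_dir", model_dir,
        "--n_layer", str(n_layer),  # spelled as in the fine-tuning example
        "--data_percentage", str(data_percentage),
    ]


if __name__ == "__main__":
    model_dir = "/datadrive/models/gta_all_cs3"
    # Print the commands; pass the lists to subprocess.run(...) to execute them.
    print(shlex.join(build_pretrain_cmd(model_dir)))
    print(shlex.join(build_finetune_cmd(model_dir)))
```

Keeping the commands as argument lists avoids shell-quoting problems if a model directory contains spaces.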

Highlighted Details

Supports heterogeneous graphs (OAG) and homogeneous graphs (Reddit).
Implements several GNN architectures; the example commands use HGT (Heterogeneous Graph Transformer).
Offers configurable pre-training tasks (attribute/edge generation) and hyperparameters.
Provides example scripts for both pre-training and fine-tuning on downstream tasks.

Maintenance & Community

The project is associated with the KDD'20 paper "Generative Pre-Training of Graph Neural Networks." No specific community channels or active maintenance indicators are present in the README.

Licensing & Compatibility

The README does not explicitly state a license. The code is primarily based on the pyHGT API. Compatibility with commercial use or closed-source linking is not specified.

Limitations & Caveats

The specified dependencies (PyTorch 1.3.0, PyTorch Geometric 1.3.2) are significantly outdated, potentially posing installation and compatibility challenges with modern systems. The README does not detail the bus factor or ongoing maintenance status.
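
Because the pinned versions are old, it can be useful to check an environment before running the scripts. The following is a small sketch using only the standard library. The distribution names torch and torch-geometric are the usual PyPI names, and the helper find_version_mismatches is hypothetical. The pins for torch-cluster, torch-scatter, and torch-sparse are left out because their exact versions are not listed here.

```python
from importlib import metadata

# Versions named in the requirements above; extend with the remaining pins
# from requirements.txt as needed.
PINNED = {"torch": "1.3.0", "torch-geometric": "1.3.2"}


def find_version_mismatches(pinned=PINNED, lookup=metadata.version):
    """Return {name: (wanted, installed)} for every package that differs.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            # Drop local build suffixes such as "+cpu" before comparing.
            installed = lookup(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_version_mismatches().items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means the installed versions match the pins that were checked; it says nothing about CUDA or compiler compatibility.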

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days