OpenGraph by HKUDS

Code for a research paper on graph foundation model pre-training

Created 1 year ago

319 stars

Top 85.0% on SourcePulse

Project Summary

OpenGraph is a foundation model for graph learning, designed to achieve zero-shot generalizability across diverse graph datasets. It targets researchers and practitioners in graph neural networks and machine learning, offering a unified approach to handle unseen graph structures and properties by leveraging insights from Large Language Models (LLMs).

How It Works

OpenGraph uses a unified graph tokenizer to adapt to new graph data, even when its properties differ from those of the training sets. A scalable graph transformer serves as the core encoder, capturing dependencies between nodes within the global topological context. To combat data scarcity, it adds an LLM-enhanced data augmentation mechanism that generates additional training graphs, which improves performance on real-world graph learning tasks.
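The tokenizer idea above can be illustrated with a small NumPy sketch. It is not the project's implementation: the function names, the smoothing order, and the use of a plain SVD are assumptions chosen to show how graphs of different sizes might be mapped to node tokens of one fixed width.

```python
import numpy as np


def smooth_adjacency(adj: np.ndarray, order: int = 2) -> np.ndarray:
    """Normalize the adjacency matrix and sum its first `order` powers.

    Hypothetical helper: the smoothing used by OpenGraph may differ.
    """
    n = adj.shape[0]
    a = adj + np.eye(n)                      # self-loops keep every degree > 0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    norm = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^-1/2 A D^-1/2

    smoothed = np.zeros_like(norm)
    power = np.eye(n)
    for _ in range(order):
        power = power @ norm                 # next power of the normalized matrix
        smoothed += power
    return smoothed


def tokenize(adj: np.ndarray, dim: int, order: int = 2) -> np.ndarray:
    """Project each node onto the top `dim` singular directions.

    Assumption: a dense SVD stands in for whatever projection the model uses.
    Requires dim <= number of nodes.
    """
    smoothed = smooth_adjacency(adj, order)
    u, s, _ = np.linalg.svd(smoothed)
    return u[:, :dim] * np.sqrt(s[:dim])     # one `dim`-wide token per node


def path_graph(n: int) -> np.ndarray:
    """Adjacency matrix of an undirected path with n nodes (toy input)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj
```

Calling `tokenize(path_graph(4), dim=3)` and `tokenize(path_graph(7), dim=3)` gives arrays with 4 and 7 rows but the same token width, which is the property that lets one encoder consume graphs it was not trained on.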

Quick Start & Requirements

Install: Clone the repository and install dependencies using pip.
Prerequisites: Python 3.10.13, PyTorch 1.13.0, NumPy 1.23.4, SciPy 1.9.3. Data files in datasets/ must be unzipped manually. Pre-trained models must be downloaded separately. An OpenAI API key is needed only for graph generation.
Usage:
- For testing link prediction: cd link_prediction/ && python main.py --load pretrn_gen1 --epoch 0
- For testing node classification: cd node_classification/ && python main.py --load pretrn_gen1 --tstdata cora
Links: Models, Usage Examples
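Before running the commands above, it can help to confirm that the environment matches the listed prerequisites. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and reading the key from an `OPENAI_API_KEY` environment variable is an assumption (the project may expect the key elsewhere).

```python
import os
import sys
from importlib import metadata

# Version pins copied from the prerequisites listed above.
PINNED = {"torch": "1.13.0", "numpy": "1.23.4", "scipy": "1.9.3"}
PINNED_PYTHON = (3, 10, 13)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(pins=PINNED):
    """Collect human-readable mismatches between the environment and the pins."""
    problems = []

    if tuple(sys.version_info[:3]) != PINNED_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python is {found}, expected 3.10.13")

    for package, wanted in pins.items():
        found = installed_version(package)
        if found is None:
            problems.append(f"{package} is not installed (expected {wanted})")
        elif found.split("+")[0] != wanted:
            # Drop local build tags such as "+cu117" before comparing.
            problems.append(f"{package} is {found}, expected {wanted}")

    # Assumption: the key is supplied through this environment variable.
    if not os.environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set (needed only for graph generation)")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

A mismatch is reported as a warning rather than an error, since nearby versions may still work; the pins above are simply the ones the summary lists.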

Highlighted Details

In the zero-shot setting, it is reported to outperform baselines that were trained or tuned with few-shot data.
LLM-enhanced data augmentation improves generalization.
Topology-aware graph tokenizer and adjacency smoothing are crucial for performance.
Token sequence sampling positively impacts model performance.
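The summary does not describe how token sequence sampling works, so the sketch below shows only one plausible reading: draw a fixed-size subset of node tokens and run self-attention over that subset, which caps the quadratic attention cost on large graphs. Uniform sampling, the function names, and the single-head attention without learned weights are all assumptions made for illustration.

```python
import numpy as np


def sample_tokens(tokens: np.ndarray, k: int, rng: np.random.Generator):
    """Pick at most k distinct node tokens uniformly at random (assumed scheme)."""
    n = tokens.shape[0]
    idx = rng.choice(n, size=min(k, n), replace=False)
    return idx, tokens[idx]


def self_attention(x: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention with no learned weights."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x


rng = np.random.default_rng(0)
all_tokens = rng.normal(size=(1000, 16))     # toy stand-in for node tokens
idx, sampled = sample_tokens(all_tokens, k=64, rng=rng)
encoded = self_attention(sampled)            # attention over 64 tokens, not 1000
```

With 64 sampled tokens the attention matrix has 64 x 64 entries rather than 1000 x 1000, which is the kind of saving a sampling step would be expected to provide.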

Maintenance & Community

The project is associated with EMNLP 2024. No specific community channels or active maintenance signals are detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project requires manual unzipping of dataset files and separate download of pre-trained models. Graph generation requires an OpenAI API key. The README does not specify a license, which may impact commercial adoption.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive
Star History: 0 stars in the last 30 days