OpenGraph by HKUDS

Code for a research paper on graph foundation model pre-training

created 1 year ago
315 stars

Project Summary

OpenGraph is a foundation model for graph learning, designed to generalize zero-shot across diverse graph datasets. It targets researchers and practitioners working on graph neural networks and machine learning, offering a unified approach to handling unseen graph structures and properties that draws on Large Language Models (LLMs).

How It Works

OpenGraph uses a unified graph tokenizer to adapt to new graph data, even when its properties differ from those of the training sets. A scalable graph transformer serves as the core encoder, efficiently capturing node dependencies within the global topological context. To combat data scarcity, it adds an LLM-enhanced data augmentation mechanism, which improves performance on real-world graph learning tasks.
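
To make the tokenizer idea more concrete, here is a minimal NumPy sketch of one way to turn graphs of different sizes into fixed-width node tokens: smooth a normalized adjacency matrix over a few hops, then project it with a truncated SVD. This is an illustrative assumption, not the repository's code. The function names smooth_adjacency and tokenize, the added self-loops, the order parameter, and the dense SVD are all hypothetical choices made for this example.

```python
import numpy as np


def smooth_adjacency(adj: np.ndarray, order: int = 2) -> np.ndarray:
    """Sum powers 1..order of the symmetrically normalized adjacency.

    Hypothetical helper: self-loops are added so every node has degree > 0.
    """
    adj = adj + np.eye(adj.shape[0])
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    norm = inv_sqrt[:, None] * adj * inv_sqrt[None, :]

    smoothed = np.zeros_like(norm)
    power = np.eye(norm.shape[0])
    for _ in range(order):
        power = power @ norm  # next power of the normalized adjacency
        smoothed += power
    return smoothed


def tokenize(adj: np.ndarray, dim: int, order: int = 2) -> np.ndarray:
    """Map each node to a `dim`-wide token via a truncated SVD.

    Hypothetical helper: `dim` must not exceed the number of nodes.
    """
    smoothed = smooth_adjacency(adj, order)
    u, s, _ = np.linalg.svd(smoothed, full_matrices=False)
    return u[:, :dim] * np.sqrt(s[:dim])


if __name__ == "__main__":
    # A 4-node path graph; any square 0/1 adjacency matrix works the same way.
    path = np.array(
        [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]],
        dtype=float,
    )
    print(tokenize(path, dim=2).shape)  # (4, 2)
```

Because the token width is fixed by dim rather than by the number of nodes, graphs of different sizes end up in the same token space, which is the property a unified tokenizer needs.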

Quick Start & Requirements

  • Install: Clone the repository and install dependencies using pip.
  • Prerequisites: Python 3.10.13, PyTorch 1.13.0, NumPy 1.23.4, SciPy 1.9.3. Data files in datasets/ must be unzipped manually. Pre-trained models must be downloaded separately. An OpenAI API key is needed for graph generation.
  • Usage:
    • For testing link prediction: cd link_prediction/ && python main.py --load pretrn_gen1 --epoch 0
    • For testing node classification: cd node_classification/ && python main.py --load pretrn_gen1 --tstdata cora
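
Since the prerequisites pin exact versions, it can help to check the local environment before running the commands above. The following is a small sketch using only the Python standard library. The names PINNED and check_environment are hypothetical and not part of the repository; the pinned versions are the ones listed in the prerequisites.

```python
import sys
from importlib import metadata

# Versions taken from the prerequisites list; the dict name is hypothetical.
PINNED = {"torch": "1.13.0", "numpy": "1.23.4", "scipy": "1.9.3"}


def check_environment(pinned=PINNED):
    """Return {package: (wanted, found, matches)} for each pinned package.

    `found` is None when the package is not installed. Local build suffixes
    such as "+cu117" are ignored when comparing versions.
    """
    report = {}
    for package, wanted in pinned.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            found = None
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    print("Python:", ".".join(str(part) for part in sys.version_info[:3]))
    for package, (wanted, found, matches) in check_environment().items():
        status = "ok" if matches else "MISMATCH"
        print(f"{package}: wanted {wanted}, found {found} [{status}]")
```

A mismatch does not necessarily mean the code will fail, only that the environment differs from the one listed in the prerequisites.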

Highlighted Details

  • Reports state-of-the-art zero-shot performance, including in comparisons against few-shot baselines.
  • LLM-enhanced data augmentation improves generalization.
  • Topology-aware graph tokenizer and adjacency smoothing are crucial for performance.
  • Token sequence sampling positively impacts model performance.

Maintenance & Community

The project is associated with EMNLP 2024. No specific community channels or active maintenance signals are detailed in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project requires manual unzipping of dataset files and a separate download of pre-trained models. Graph generation requires an OpenAI API key. The README does not specify a license, which may hinder commercial adoption.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

3 stars in the last 90 days
