G-Retriever  by XiaoxinHe

Research paper implementation for graph-based question answering

Created 2 years ago
530 stars

Top 59.7% on SourcePulse

Project Summary

G-Retriever is a framework for understanding and answering questions over real-world textual graphs. It targets researchers and practitioners working on scene graph understanding, common sense reasoning, and knowledge graph reasoning, and it combines GNNs, LLMs, and RAG to improve graph comprehension.

How It Works

G-Retriever combines Graph Neural Networks (GNNs) for graph representation, Large Language Models (LLMs) for generation, and Retrieval-Augmented Generation (RAG) to supply relevant graph context. The framework can be fine-tuned via soft prompting, which is intended to help the LLM understand and reason over graph structures and so improve accuracy on question-answering tasks.
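The flow described above can be illustrated with a small, dependency-free sketch. It is not the repository's code: top-k cosine similarity stands in for the retrieval step (the listed `pcst_fast` dependency suggests the actual retrieval differs), mean pooling stands in for a GNN readout, and every function name here is hypothetical. The point is only the shape of the idea: retrieve part of the graph, encode it into one vector, project that vector into the LLM's embedding space, and prepend it to the token embeddings as a soft prompt.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec, node_vecs, k):
    """Indices of the k nodes most similar to the query (simplified retrieval)."""
    ranked = sorted(range(len(node_vecs)),
                    key=lambda i: cosine(query_vec, node_vecs[i]),
                    reverse=True)
    return ranked[:k]


def mean_pool(vectors):
    """Average several vectors into one (assumed stand-in for a GNN readout)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def project(vec, weight):
    """Linear map into the LLM embedding space; weight is (out_dim x in_dim)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def build_llm_inputs(query_vec, node_vecs, token_embeds, weight, k=2):
    """Prepend one soft-prompt vector, built from retrieved nodes, to the tokens."""
    kept = retrieve_top_k(query_vec, node_vecs, k)
    graph_vec = mean_pool([node_vecs[i] for i in kept])
    soft_prompt = project(graph_vec, weight)
    return kept, [soft_prompt] + token_embeds


# Toy example: three 2-d node vectors, two 3-d token embeddings.
node_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
token_embeds = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kept, llm_inputs = build_llm_inputs([1.0, 0.0], node_vecs, token_embeds, weight)
```

In this toy run the LLM input grows by exactly one vector, which is the soft prompt. In a trainable version, the projection weights (and the graph encoder) would be the parameters updated during fine-tuning.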

Quick Start & Requirements

  • Install: Requires PyTorch 2.0.1 with CUDA 11.8, PyG libraries, peft, pandas, ogb, transformers, wandb, sentencepiece, datasets, pcst_fast, gensim, scipy==1.12, and protobuf.
  • LLM: Access to Llama 2 (7b-hf) via Hugging Face, requiring a Hugging Face account and access token.
  • Data Preprocessing: Commands provided for expla_graphs, scene_graphs, and webqsp datasets.
  • Training: Scripts for inference-only LLM, frozen LLM with prompt tuning, fine-tuned LLM with LoRA, and G-Retriever with LoRA.
  • Reproducibility: A run.sh script is available for reproducing paper results.
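Because the setup depends on specific versions, it can help to check the environment before running anything. The sketch below is an assumption-laden helper, not part of the repository: it compares installed packages against the two pins named in the list above (PyTorch 2.0.1 and scipy 1.12) using the standard library's `importlib.metadata`. The function names and the `PINNED` dictionary are hypothetical.

```python
from importlib import metadata

# Pins taken from the requirements listed above; the other packages are unpinned there.
PINNED = {"torch": "2.0.1", "scipy": "1.12"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pin):
    """True if the installed version starts with the pinned components.

    For example, "1.12.0" matches the pin "1.12", but "1.1.2" does not.
    """
    if installed is None:
        return False
    # Drop local build tags such as "+cu118" before comparing.
    core = installed.split("+", 1)[0]
    pin_parts = pin.split(".")
    return core.split(".")[: len(pin_parts)] == pin_parts


def check_pins(pins=PINNED, lookup=installed_version):
    """Map each mismatched or missing package to (expected, found)."""
    problems = {}
    for name, pin in pins.items():
        found = lookup(name)
        if not matches_pin(found, pin):
            problems[name] = (pin, found)
    return problems
```

Calling `check_pins()` with no arguments inspects the current environment and returns an empty dictionary when both pins are satisfied. The `lookup` parameter exists only so the check can be exercised without installing anything.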

Highlighted Details

  • Integrates GNNs, LLMs, and RAG for textual graph QA.
  • Supports fine-tuning via soft prompting for enhanced graph understanding.
  • Official implementation of the NeurIPS 2024 paper "G-Retriever".
  • PyG 2.6 compatibility noted.

Maintenance & Community

  • NeurIPS 2024 publication.
  • No explicit community links (Discord/Slack) or roadmap mentioned in the README.

Licensing & Compatibility

  • The README does not state a license. The repository appears to be intended for research use; terms for commercial use or closed-source linking are not specified.

Limitations & Caveats

The setup requires specific versions of PyTorch and CUDA, as well as access to Llama 2 models, which in turn requires a Hugging Face account and access token. The README reports no performance benchmarks beyond the paper's claims, and its documentation is limited to setup and training commands.

Health Check

  • Last Commit: 11 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 30 days
