G-Retriever by XiaoxinHe

Research paper implementation for graph-based question answering

created 1 year ago
473 stars

Top 65.3% on sourcepulse

Project Summary

G-Retriever is a framework for understanding and answering questions about real-world textual graphs. It targets researchers and practitioners working on scene graph understanding, common sense reasoning, and knowledge graph reasoning, and it combines GNNs, LLMs, and RAG to improve graph comprehension.

How It Works

G-Retriever combines Graph Neural Networks (GNNs) for graph representation, Large Language Models (LLMs) for answer generation, and Retrieval-Augmented Generation (RAG) for supplying relevant graph context. The LLM is adapted through soft prompting so that it can reason over graph structure, with the aim of improving accuracy on question-answering tasks.
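
The data flow described above can be illustrated with a minimal, dependency-free sketch. It is not the repository's code: the cosine-similarity top-k retrieval, the mean pooling that stands in for a GNN, and every function name, dimension, and weight below are assumptions chosen for illustration. The sketch only shows the order of operations: retrieve relevant nodes, pool them into one graph vector, project that vector into the LLM embedding space, and prepend it to the token embeddings as a soft prompt.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec, node_vecs, k):
    """Return indices of the k nodes most similar to the query (retrieval step)."""
    ranked = sorted(range(len(node_vecs)),
                    key=lambda i: cosine(query_vec, node_vecs[i]),
                    reverse=True)
    return ranked[:k]


def mean_pool(vectors):
    """Average node vectors into one graph vector (assumed stand-in for a GNN)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def project(vec, weight):
    """Linear projection; weight has shape (llm_dim, graph_dim)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def build_llm_inputs(graph_vec, weight, token_embeddings):
    """Prepend the projected graph vector to the token embeddings as a soft prompt."""
    return [project(graph_vec, weight)] + token_embeddings


# Toy data: 3 nodes with 2-d embeddings, a 2-d query, 3-d LLM embeddings.
node_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vec = [1.0, 0.0]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical projection weights
token_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

selected = retrieve_top_k(query_vec, node_vecs, k=2)
graph_vec = mean_pool([node_vecs[i] for i in selected])
llm_inputs = build_llm_inputs(graph_vec, weight, token_embeddings)
```

In a trained system the projection weights would be learned rather than fixed, and the sequence in `llm_inputs` would be passed to the LLM in place of plain token embeddings.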

Quick Start & Requirements

  • Install: Requires PyTorch 2.0.1 with CUDA 11.8, PyG libraries, peft, pandas, ogb, transformers, wandb, sentencepiece, datasets, pcst_fast, gensim, scipy==1.12, and protobuf.
  • LLM: Access to Llama 2 (7b-hf) via Hugging Face, requiring a Hugging Face account and access token.
  • Data Preprocessing: Commands provided for expla_graphs, scene_graphs, and webqsp datasets.
  • Training: Scripts for inference-only LLM, frozen LLM with prompt tuning, fine-tuned LLM with LoRA, and G-Retriever with LoRA.
  • Reproducibility: A run.sh script is available for reproducing paper results.
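
Because the setup pins specific versions, a quick check before training can catch mismatches early. The following is a minimal sketch and not part of the repository: the `check_pins` helper is hypothetical, the distribution names `torch` and `scipy` are assumptions, and a pin is treated as satisfied when the installed version equals it or extends it (so `2.0.1+cu118` or `1.12.0` count), which is looser than pip's exact `==` matching.

```python
from importlib import metadata

# Pins taken from the requirements listed above; distribution names are assumptions.
PINS = {"torch": "2.0.1", "scipy": "1.12"}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def matches(found, pinned):
    """True if found equals the pin or extends it (e.g. 1.12.0 or 2.0.1+cu118)."""
    if found is None:
        return False
    return found == pinned or found.startswith((pinned + ".", pinned + "+"))


def check_pins(pins, get_version=installed_version):
    """Map each pinned package to a (found_version, ok) pair."""
    result = {}
    for name, pin in pins.items():
        found = get_version(name)
        result[name] = (found, matches(found, pin))
    return result


if __name__ == "__main__":
    for name, (found, ok) in check_pins(PINS).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: found={found} pinned={PINS[name]} {status}")
```

The `get_version` parameter exists so the check can be exercised without the packages installed, for example by passing the `get` method of a plain dictionary of fake versions.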

Highlighted Details

  • Integrates GNNs, LLMs, and RAG for textual graph QA.
  • Supports fine-tuning via soft prompting for enhanced graph understanding.
  • Official implementation of the NeurIPS 2024 paper "G-Retriever".
  • The README notes compatibility with PyG 2.6.

Maintenance & Community

  • NeurIPS 2024 publication.
  • No explicit community links (Discord/Slack) or roadmap mentioned in the README.

Licensing & Compatibility

  • The README does not state a license, so terms for commercial use or closed-source linking are unspecified. The repository's name and structure suggest it is intended for research use.

Limitations & Caveats

The setup requires specific versions of PyTorch and CUDA, as well as access to Llama 2 models, which in turn requires a Hugging Face account and access token. The README does not report performance benchmarks beyond the paper's claims, and its documentation is limited to setup and training commands.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 33 stars in the last 90 days
