Graph-R1 by LHRLAB

Agentic GraphRAG framework using end-to-end reinforcement learning

Created 6 months ago
387 stars

Top 74.0% on SourcePulse

View on GitHub
Project Summary

Graph-R1 introduces an end-to-end reinforcement learning framework to enhance LLM reasoning on graph-structured knowledge, addressing the disconnect between language and graph modalities in GraphRAG systems. It targets researchers and developers in knowledge-intensive domains like healthcare, finance, and law, aiming to improve answer quality by enabling LLMs to iteratively query and refine information from knowledge hypergraphs.

How It Works

The framework constructs a knowledge hypergraph using n-ary relation extraction. It then employs an explicit reward mechanism within RL to guide LLMs through a "think–generate query–retrieve subgraph–rethink" reasoning cycle. This iterative process allows the LLM to dynamically leverage graph knowledge, leading to more accurate and contextually relevant answers.
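The loop below is a minimal sketch of that cycle under assumed interfaces; `llm`, `retriever`, and the QUERY/ANSWER message format are hypothetical stand-ins, not the repository's actual API.

    # Hypothetical sketch of the "think–generate query–retrieve subgraph–rethink" cycle.
    def agentic_graphrag(question, llm, retriever, max_turns=4):
        """Iteratively query the knowledge hypergraph until the model commits to an answer."""
        context = [f"Question: {question}"]
        for _ in range(max_turns):
            # Think: reason over everything gathered so far and either propose a
            # retrieval query or commit to a final answer.
            step = llm.generate("\n".join(context))
            if step.startswith("ANSWER:"):
                return step.removeprefix("ANSWER:").strip()
            if step.startswith("QUERY:"):
                # Retrieve: pull a relevant subgraph (n-ary facts) from the hypergraph.
                subgraph = retriever.search(step.removeprefix("QUERY:").strip())
                # Rethink: fold the retrieved facts back into the context and loop.
                context.append(f"Retrieved facts: {subgraph}")
            else:
                context.append(step)
        # Out of turns: answer with whatever has been retrieved so far.
        return llm.generate("\n".join(context) + "\nANSWER:")

During training, the explicit reward on the final answer is propagated end-to-end through this interaction loop, which is what shapes how the model decides when to query and when to answer.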

Quick Start & Requirements

  • Installation: Requires Python 3.11.11, PyTorch 2.4.0 with CUDA 12.4, and FlashAttention; create a conda environment, activate it, and install the remaining dependencies via pip.
  • Data: Experiments are conducted on six datasets (2WikiMultiHopQA, HotpotQA, Musique, NQ, PopQA, TriviaQA), downloadable from TeraBox.
  • Setup: Includes scripts for dataset preprocessing, knowledge hypergraph building (optional; requires an OpenAI API key for GPT-4o-mini; see the extraction sketch after this list), and setting up a retrieval server.
  • Training: Requires 4 x 48GB GPUs for training with models like Qwen2.5-3B-Instruct using GRPO, REINFORCE++, or PPO algorithms.
  • Links: Paper, Datasets, Pre-built Hypergraph
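For the optional hypergraph-building step, the snippet below is a hedged sketch of how n-ary relation extraction with GPT-4o-mini could be driven; the prompt, JSON schema, and function name are assumptions rather than the repository's actual preprocessing scripts.

    # Hypothetical n-ary relation extraction for knowledge hypergraph construction.
    # Requires the `openai` package and an OPENAI_API_KEY in the environment.
    import json
    from openai import OpenAI

    client = OpenAI()

    PROMPT = (
        "Extract n-ary relations from the passage. Return a JSON object with a "
        "'relations' key holding a list of items, each with a 'relation' name and "
        "an 'entities' list of two or more entities."
    )

    def extract_hyperedges(passage: str) -> list[dict]:
        """Return candidate hyperedges (each n-ary relation becomes one hyperedge)."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": PROMPT + "\n\nPassage:\n" + passage}],
            response_format={"type": "json_object"},
        )
        return json.loads(resp.choices[0].message.content).get("relations", [])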

Highlighted Details

  • End-to-end RL framework for agentic GraphRAG.
  • Iterative "think-query-retrieve-rethink" reasoning cycle.
  • Supports multiple RL algorithms (GRPO, REINFORCE++, PPO); a GRPO advantage sketch follows this list.
  • Evaluated on six diverse multi-hop question-answering datasets.
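As a reference point for the GRPO option, here is a minimal sketch of the group-relative advantage that GRPO is built around (general to GRPO, not code from this repository): sample several rollouts per question, score each with the reward, and normalize within the group.

    # Group-relative advantages as used by GRPO-style training (illustrative only).
    import statistics

    def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
        """Advantage of each rollout relative to its sampling group."""
        mean = statistics.fmean(rewards)
        std = statistics.pstdev(rewards)
        return [(r - mean) / (std + eps) for r in rewards]

    # Example: four rollouts for one question, scored by an answer-quality reward.
    print(grpo_advantages([1.0, 0.0, 0.5, 1.0]))  # higher-reward rollouts get positive advantage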

Maintenance & Community

The project is associated with the authors listed in the paper and acknowledges contributions from several related projects. Contact email: haoran-luo@outlook.com.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Training has significant hardware requirements (4 x 48GB GPUs), the knowledge hypergraph construction step depends on an external OpenAI API (GPT-4o-mini), and the unstated license needs clarification before broader adoption.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 7
  • Star History: 57 stars in the last 30 days
