GraphAgent by HKUDS

Agentic pipeline for graph-enhanced language understanding

created 7 months ago
309 stars

Project Summary

GraphAgent is an automated agent pipeline for real-world data that combines structured graph information with unstructured text. It targets researchers and practitioners working with such datasets, and supports both predictive and generative tasks by pairing language models with graph language models to capture relational dependencies.

How It Works

GraphAgent employs a three-component agentic pipeline: a Graph Generator Agent builds knowledge graphs to capture semantic dependencies, a Task Planning Agent interprets user queries and plans execution, and a Task Execution Agent automates tool matching and invocation. This approach integrates LLMs with graph-enhanced LLMs to process both explicit graph connections and implicit semantic interdependencies, offering a unified framework for graph-centric AI tasks.
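The division of labor among the three agents can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: every name here (`Graph`, `build_graph`, `plan_task`, `execute_task`, `TOOLS`) is hypothetical, the word-overlap graph and keyword-based planner stand in for LLM calls, and nothing below reflects GraphAgent's actual code or API.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy knowledge graph: node id -> text, plus undirected edges."""
    nodes: dict[str, str] = field(default_factory=dict)
    edges: set[tuple[str, str]] = field(default_factory=set)


def build_graph(docs: dict[str, str], explicit_edges=()) -> Graph:
    """Graph generator stand-in (hypothetical): keep explicit edges and add an
    implicit edge between any two documents sharing a word longer than 4 chars."""
    graph = Graph(nodes=dict(docs), edges={tuple(sorted(e)) for e in explicit_edges})
    ids = sorted(docs)
    for i, a in enumerate(ids):
        words_a = {w for w in docs[a].lower().split() if len(w) > 4}
        for b in ids[i + 1:]:
            words_b = {w for w in docs[b].lower().split() if len(w) > 4}
            if words_a & words_b:
                graph.edges.add((a, b))
    return graph


def plan_task(query: str) -> str:
    """Task planner stand-in (hypothetical): a keyword rule, not an LLM call."""
    predictive = ("classify", "predict", "label")
    return "predictive" if any(k in query.lower() for k in predictive) else "generative"


def neighbors(graph: Graph, node: str) -> list[str]:
    """Return the sorted ids of nodes that share an edge with `node`."""
    return sorted({b if a == node else a for a, b in graph.edges if node in (a, b)})


# Hypothetical tool registry keyed by task type.
TOOLS = {
    "predictive": lambda g, n: f"{n}: label inferred from neighbors {neighbors(g, n)}",
    "generative": lambda g, n: f"{n}: summary drawing on {neighbors(g, n)}",
}


def execute_task(graph: Graph, query: str, node: str) -> str:
    """Task executor stand-in: match the planned task type to a tool and call it."""
    return TOOLS[plan_task(query)](graph, node)


docs = {
    "p1": "Graph neural networks for citation analysis",
    "p2": "Language models improve citation recommendation",
    "p3": "Protein folding with diffusion",
}
graph = build_graph(docs, explicit_edges=[("p1", "p3")])
result = execute_task(graph, "Classify this paper", "p1")
print(result)
```

In this sketch, the edge between `p1` and `p3` is explicit, while the edge between `p1` and `p2` is implicit, inferred from the shared word "citation". That mirrors, in a very reduced form, the distinction the pipeline draws between explicit graph connections and implicit semantic interdependencies.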

Quick Start & Requirements

  • Installation: Clone the repository, create a conda environment (conda create -n graphagent python=3.11), activate it (conda activate graphagent), and install requirements (pip install -r GraphAgent-inference/requirements.txt).
  • Models: Requires pre-trained checkpoints for GraphAgent (e.g., GraphAgent/GraphAgent-8B), a graph tokenizer (GraphAgent/GraphTokenizer), and a sentence transformer (sentence-transformers/all-mpnet-base-v2). These can be downloaded to a local path in advance; otherwise they are auto-downloaded.
  • API Keys: LLM calls require API keys. Set OPENAI_API_KEY for the default planner (DeepSeek).
  • Inference: Run via bash GraphAgent-inference/run.sh.
  • Resources: Python 3.11, conda. VRAM requirements depend on the size of the LLM checkpoint used.
  • Docs: Paper, Models
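
Before launching the inference script, it may help to confirm the prerequisites listed above. The following is a minimal sketch and not part of the repository: `preflight` and `REQUIRED_MODELS` are hypothetical names, and the check covers only the Python version and the OPENAI_API_KEY variable named in this section.

```python
import os
import sys

# Model identifiers as listed in this section; whether they resolve to local
# paths or automatic downloads is left to the repository's own scripts.
REQUIRED_MODELS = (
    "GraphAgent/GraphAgent-8B",
    "GraphAgent/GraphTokenizer",
    "sentence-transformers/all-mpnet-base-v2",
)


def preflight(env=None, version=None) -> list[str]:
    """Hypothetical helper: return a list of problems; empty means OK."""
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else version
    problems = []
    if version != (3, 11):
        problems.append(f"expected Python 3.11, found {version[0]}.{version[1]}")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
    print("models expected:", ", ".join(REQUIRED_MODELS))
```

The `env` and `version` parameters exist only so the check can be exercised without touching the real environment; called with no arguments, it inspects the current interpreter and process environment.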

Highlighted Details

  • Achieves state-of-the-art performance on graph predictive tasks (e.g., node classification) and generative tasks (e.g., related work generation).
  • Demonstrates significant improvements over existing graph neural networks and RAG methods on benchmarks like ACM-1000, Arxiv-Papers, and ICLR-Peer Reviews.
  • Integrates a multimodal Llama 3 model capable of processing graph tokens.
  • Offers a flexible architecture supporting various LLMs and graph data sources.

Maintenance & Community

The project is associated with authors from HKU and is actively being developed, with inference code, model checkpoints, and datasets released. Training code and procedures are noted as "Coming Soon!".

Licensing & Compatibility

The repository does not explicitly state a license in the README. The citation format suggests it is based on academic research. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Training code and procedures for GraphAgent are not yet released. The README indicates that training on custom data is also "Coming Soon!". The primary inference mechanism relies on API-based LLM calls, requiring API key configuration.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 11 stars in the last 90 days
