graphrag by microsoft

Modular RAG system using graph-based knowledge representation

created 1 year ago
26,934 stars

Top 1.5% on sourcepulse

Project Summary

GraphRAG is a modular, graph-based Retrieval-Augmented Generation (RAG) system designed to extract structured data from unstructured text using LLMs. It aims to enhance LLM reasoning capabilities over private data by leveraging knowledge graph memory structures. The target audience includes developers and researchers working with LLMs who need to improve their models' understanding and generation based on complex, private datasets.

How It Works

GraphRAG employs a data pipeline and transformation suite that utilizes LLMs to extract meaningful, structured data from unstructured text. It builds knowledge graph memory structures to augment LLM outputs, enabling more sophisticated reasoning. This approach aims to provide a more robust and context-aware RAG system compared to traditional methods.
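The idea of turning extracted facts into a graph and then reading context back out of it can be illustrated with a toy sketch. This is not GraphRAG's API: the hard-coded TRIPLES stand in for what an LLM extraction step might produce, and build_graph and collect_context are hypothetical names invented for this illustration.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples standing in for
# the output of an LLM extraction step over unstructured text.
TRIPLES = [
    ("Alice", "works_at", "Contoso"),
    ("Contoso", "located_in", "Seattle"),
    ("Bob", "works_at", "Contoso"),
]


def build_graph(triples):
    """Build an adjacency list, storing each edge in both directions."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse_{rel}", subj))
    return graph


def collect_context(graph, entity, hops=1):
    """Breadth-first walk that gathers facts within `hops` edges of `entity`."""
    seen = {entity}
    frontier = [entity]
    facts = []
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for rel, neighbor in graph.get(node, []):
                facts.append(f"{node} {rel} {neighbor}")
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return facts


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    # Facts gathered here could be prepended to an LLM prompt as context.
    for fact in collect_context(graph, "Alice", hops=2):
        print(fact)
```

With hops=2, a question about Alice also picks up facts about her employer's location and colleagues, which is the kind of multi-step context a flat text lookup might miss.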

Quick Start & Requirements

  • Installation: The recommended approach is to use the Solution Accelerator package for an end-to-end experience with Azure resources.
  • Prerequisites: Specific requirements are detailed in the documentation, including potential costs associated with indexing.
  • Resources: Indexing operations can be expensive; users are advised to read documentation and start with small datasets.
  • Documentation: https://microsoft.github.io/graphrag/

Highlighted Details

  • Modular, graph-based RAG system.
  • Focuses on extracting structured data from unstructured text.
  • Enhances LLM reasoning with knowledge graph memory.
  • Requires prompt tuning for optimal results.

Maintenance & Community

  • The code is a demonstration and not officially supported by Microsoft.
  • Discussions and feedback are encouraged via GitHub Discussions.
  • Contribution guidelines are available in CONTRIBUTING.md.

Licensing & Compatibility

  • The README does not explicitly state a license.
  • Use of Microsoft trademarks is subject to brand guidelines.

Limitations & Caveats

GraphRAG indexing can be expensive. Prompt tuning is strongly recommended for optimal results; consult the prompt tuning guide. When upgrading, run graphrag init --root [path] --force between minor version bumps, and run the migration notebook between major version bumps to avoid re-indexing.
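The upgrade rule can be written down as a small decision helper. This is only a sketch: upgrade_step is a hypothetical function, not part of GraphRAG, and it assumes version strings of the form major.minor.patch. The command string is copied from the text above, with [path] left as a placeholder.

```python
def upgrade_step(old: str, new: str) -> str:
    """Return the upgrade action suggested by the rule described above.

    Assumes versions look like "major.minor.patch" (an assumption of
    this sketch, not something stated by the project summary).
    """
    old_major, old_minor = (int(part) for part in old.split(".")[:2])
    new_major, new_minor = (int(part) for part in new.split(".")[:2])

    if new_major != old_major:
        # Major bump: the migration notebook avoids re-indexing.
        return "run the migration notebook"
    if new_minor != old_minor:
        # Minor bump: re-run init with --force.
        return "graphrag init --root [path] --force"
    return "no action needed"


if __name__ == "__main__":
    print(upgrade_step("1.0.0", "1.1.0"))
    print(upgrade_step("1.2.0", "2.0.0"))
```

Patch-level changes fall through to "no action needed" here only because the text names no step for them.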

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 15
  • Issues (30d): 11

Star History

  • 2,193 stars in the last 90 days
