GraphRag.Net by AIDotNet

A .NET GraphRAG framework for advanced document analysis and Q&A

Created 1 year ago
252 stars

Top 99.6% on SourcePulse

Project Summary

GraphRag.Net is a .NET implementation of the GraphRAG (graph-based retrieval-augmented generation) approach. It uses Semantic Kernel to build knowledge graphs from documents and to answer questions over them. It targets .NET developers who want to add graph-based RAG to their applications, and it offers a structured way to process, represent, and query information through graph structures and community detection.

How It Works

The project follows a multi-stage process inspired by GraphRAG. It begins by segmenting source documents into text chunks, then extracts entities and relationships to form graph nodes and edges. These are deduplicated and stored in relational and vector databases. Community detection algorithms, specifically Fast Label Propagation, are applied to group related nodes. LLMs then generate summaries for individual elements, communities, and a global overview. Querying involves vector search to find relevant nodes, which can then be expanded into a subgraph or a community subgraph for context-aware answer generation.
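The community detection step can be illustrated with a small label propagation sketch. This is a language-neutral illustration written in Python, not the project's C# code. The function names (`label_propagation`, `communities`), the queue-based update loop, and the deterministic tie-break (largest label wins) are assumptions chosen so the example is short and reproducible; real implementations usually break ties at random.

```python
from collections import Counter, deque


def label_propagation(edges):
    """Assign a community label to every node of an undirected graph.

    Each node starts in its own community. Nodes are processed from a queue:
    a node adopts the label most common among its neighbors, and when it
    changes, neighbors that now disagree with it are re-queued.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    labels = {node: node for node in neighbors}
    queue = deque(sorted(neighbors))
    queued = set(queue)

    while queue:
        node = queue.popleft()
        queued.discard(node)

        counts = Counter(labels[n] for n in neighbors[node])
        best = max(counts.values())

        # Keep the current label if it is already among the most frequent.
        # A node therefore only changes when it strictly gains agreeing
        # neighbors, which guarantees the loop terminates.
        if counts.get(labels[node], 0) == best:
            continue

        # Deterministic tie-break (an assumption made for reproducibility).
        new_label = max(label for label, c in counts.items() if c == best)
        labels[node] = new_label

        for n in neighbors[node]:
            if labels[n] != new_label and n not in queued:
                queue.append(n)
                queued.add(n)

    return labels


def communities(labels):
    """Group nodes by label and return sorted member lists."""
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, []).append(node)
    return sorted(sorted(members) for members in groups.values())


# Two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
print(communities(label_propagation(edges)))
# [['a', 'b', 'c'], ['d', 'e', 'f']]
```

Label propagation is sensitive to processing order and tie-breaking. With a different tie-break, a graph this small can collapse into a single community, which is one reason implementations tend to randomize.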

Quick Start & Requirements

  • Installation: Install via NuGet: dotnet add package GraphRag.Net. To run the demo project: dotnet run --project GraphRag.Net.Web.csproj.
  • Prerequisites:
    • .NET SDK.
    • LLM API endpoint and key (OpenAI compatible by default, configurable in appsettings.json).
    • Database configuration (SQLite or PostgreSQL for graph data, SQLite for vector data by default).
  • Access: Swagger UI available at http://localhost:5000/swagger, and a Blazor UI at http://localhost:5000/.
  • Configuration: Key settings for LLM, text chunking, database connections, and search parameters are managed in appsettings.json.
  • Demo Data: Pre-trained data can be downloaded and used for testing.

Highlighted Details

  • Implements a multi-layer summarization approach, generating summaries for elements, communities, and globally.
  • Utilizes Fast Label Propagation for community detection and employs a recursive graph expansion strategy for query context.
  • Features token count estimation and dynamic node pruning to manage LLM input size constraints.
  • Includes mechanisms for handling orphan nodes and deduplicating graph elements using semantic similarity.
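The recursive expansion and token-based pruning mentioned above can be sketched as follows. This is an illustrative Python sketch, not the project's C# implementation. The names `estimate_tokens`, `expand`, and `prune_to_budget`, the four-characters-per-token heuristic, and the "nearest nodes first" pruning order are all assumptions. The seed nodes are assumed to come from a vector search that is not shown here.

```python
def estimate_tokens(text):
    # Rough heuristic: about 4 characters per token. This is an assumption,
    # not the estimator the project uses.
    return max(1, len(text) // 4)


def expand(graph, seeds, max_depth):
    """Recursively collect nodes within max_depth hops of any seed.

    graph maps a node to a list of its neighbors.
    Returns {node: smallest hop distance from a seed}.
    """
    distance = {}

    def visit(node, depth):
        # Stop if too deep, or if the node was already reached at least as cheaply.
        if depth > max_depth or distance.get(node, depth + 1) <= depth:
            return
        distance[node] = depth
        for neighbor in graph.get(node, ()):
            visit(neighbor, depth + 1)

    for seed in seeds:
        visit(seed, 0)
    return distance


def prune_to_budget(distance, descriptions, max_tokens):
    """Keep the nodes closest to the seeds until the token budget is spent."""
    kept, used = [], 0
    for node in sorted(distance, key=lambda n: (distance[n], n)):
        cost = estimate_tokens(descriptions[node])
        if used + cost > max_tokens:
            break  # everything after this point is farther away; drop it
        kept.append(node)
        used += cost
    return kept, used


# Hypothetical toy graph: A-B, A-C, B-D, D-E.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
descriptions = {node: "d" * 40 for node in graph}  # 40 chars, about 10 tokens each

reached = expand(graph, seeds=["A"], max_depth=2)
kept, used = prune_to_budget(reached, descriptions, max_tokens=30)
print(kept, used)
# ['A', 'B', 'C'] 30
```

Pruning by hop distance is only one possible policy. Ranking by vector similarity to the query, or by edge weight, would fit the same structure by changing the sort key.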

Maintenance & Community

The project is associated with the AntSK project, and a demo environment is provided. The README does not describe maintenance activity, contributors, or community channels such as Discord or Slack.

Licensing & Compatibility

The open-source license is not stated in the provided text. Whether the project can be used commercially or linked from closed-source code depends on the license it actually carries.

Limitations & Caveats

This project is presented as a demo for learning the GraphRAG concept. LLM integration defaults to the OpenAI API specification; other models require additional configuration or an intermediary service such as one-api. Community and global summaries may need to be generated manually after data import.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days
