kg-gen  by stair-lab

Knowledge graph generator for text analysis and RAG

created 9 months ago
550 stars

Top 59.0% on sourcepulse

Project Summary

This library extracts knowledge graphs from arbitrary text. It targets users who are building RAG systems, generating synthetic data for ML, or structuring and analyzing relationships in text. It supports a range of LLM providers via LiteLLM and uses DSPy for structured output generation.

How It Works

kg-gen uses large language models (LLMs) to identify entities and relationships within text. It splits large documents into chunks before extraction and can optionally cluster similar entities and relations to normalize the output. The library accepts both single text strings and conversational message formats, and it preserves conversational context for richer graph extraction.
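The chunking step described above can be illustrated with a minimal sketch. This is not kg-gen's implementation: the function name `chunk_text`, the `max_chars` parameter, and the sentence-splitting rule are all assumptions chosen for illustration. It shows one plausible way to break a long document into pieces that each fit within a size budget before they are sent to an LLM.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Hypothetical helper: greedily pack whole sentences into chunks.

    A chunk is closed once adding the next sentence would push it past
    max_chars. A single sentence longer than max_chars is kept whole.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    doc = "Alice knows Bob. Bob works at Acme. Acme is in Paris."
    for piece in chunk_text(doc, max_chars=40):
        print(piece)
```

Splitting on sentence boundaries rather than at fixed character offsets keeps each entity mention and its relation in the same chunk more often, which likely matters for extraction quality.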

Quick Start & Requirements

Highlighted Details

  • Supports numerous LLM providers via LiteLLM (OpenAI, Ollama, Gemini, Anthropic, etc.).
  • Features chunking for large texts and clustering for entity/relation normalization.
  • Handles conversational data by preserving message roles and order.
  • Allows aggregation of multiple extracted graphs.
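To make the aggregation and normalization ideas above concrete, here is a small self-contained sketch. It does not use kg-gen's API: the `Graph` class and the `aggregate` and `normalize` functions are hypothetical, and the case-folding rule is a deliberately naive stand-in for the library's clustering, not a description of how the library does it.

```python
from dataclasses import dataclass, field

# A relation is a (subject, predicate, object) triple.
Triple = tuple[str, str, str]


@dataclass
class Graph:
    """Hypothetical container for an extracted graph."""
    entities: set[str] = field(default_factory=set)
    relations: set[Triple] = field(default_factory=set)


def aggregate(graphs: list[Graph]) -> Graph:
    """Merge several graphs by taking the union of entities and relations."""
    merged = Graph()
    for g in graphs:
        merged.entities |= g.entities
        merged.relations |= g.relations
    return merged


def _canon(label: str) -> str:
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def normalize(graph: Graph) -> Graph:
    """Naive stand-in for clustering: labels that differ only in case or
    whitespace collapse into one canonical label."""
    return Graph(
        entities={_canon(e) for e in graph.entities},
        relations={(_canon(s), _canon(p), _canon(o)) for s, p, o in graph.relations},
    )


if __name__ == "__main__":
    g1 = Graph({"Alice", "Bob"}, {("Alice", "knows", "Bob")})
    g2 = Graph({"alice", "Acme"}, {("alice", "works at", "Acme")})
    combined = normalize(aggregate([g1, g2]))
    print(sorted(combined.entities))
    print(sorted(combined.relations))
```

Storing entities and relations as sets makes aggregation idempotent: merging the same graph twice yields the same result. Real clustering would also need to merge labels that differ in more than surface form, for example synonyms or abbreviations.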

Maintenance & Community

The project is maintained by stair-lab. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The permissive MIT license allows commercial use and integration with closed-source projects.

Limitations & Caveats

The quality of the generated graph depends on the capabilities of the chosen LLM and on the quality of the input text. Clustering and aggregation are optional and may need tuning to produce good results.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 110 stars in the last 90 days
