kg-gen by stair-lab

Knowledge graph generator for text analysis and RAG

Created 1 year ago

1,035 stars

Top 36.0% on SourcePulse


1 Expert Loves This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Project Summary

kg-gen extracts knowledge graphs from plain text. It targets users who build RAG systems, generate synthetic data for ML, or need to structure and analyze relationships in text. It supports multiple LLM providers through LiteLLM and uses DSPy for structured output generation.

How It Works

kg-gen uses large language models (LLMs) to identify entities and relationships within text. Large documents are split into chunks before extraction, and similar entities and relations can optionally be clustered to normalize the output. The library accepts both single text strings and conversational message formats; for the latter it preserves conversational context for richer graph extraction.
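To make the chunking step concrete, here is a minimal sketch in plain Python. It is not kg-gen's implementation: the function name `chunk_text`, the `max_chars` parameter, and the sentence-boundary strategy are all assumptions chosen for illustration. It only shows one plausible way a long document could be split before each piece is sent to an LLM.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Hypothetical helper: group whole sentences into chunks.

    Chunks stay within max_chars where possible. A single sentence
    longer than max_chars is kept intact as its own oversized chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # Sentence still fits (or the chunk is empty): keep growing.
            current = candidate
        else:
            # Chunk is full: close it and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence boundaries rather than at a fixed character offset keeps each entity mention and its relation in the same chunk more often, which matters because each chunk is extracted independently.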

Quick Start & Requirements

Install: pip install kg-gen
Requirements: Python 3.x, LLM API keys or local model setup (e.g., Ollama).
Docs: https://github.com/stair-lab/kg-gen

Highlighted Details

Supports numerous LLM providers via LiteLLM (OpenAI, Ollama, Gemini, Anthropic, etc.).
Features chunking for large texts and clustering for entity/relation normalization.
Handles conversational data by preserving message roles and order.
Allows aggregation of multiple extracted graphs.
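The aggregation and normalization ideas above can be sketched with ordinary sets. This is an illustration under stated assumptions, not the library's API: `SimpleGraph`, `aggregate`, and `normalize` are hypothetical names, and the alias map is supplied by hand here, whereas kg-gen is described as using an LLM to cluster similar entities and relations.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleGraph:
    """Hypothetical stand-in for an extracted graph."""
    entities: set[str] = field(default_factory=set)
    # Each relation is a (subject, predicate, object) triple.
    relations: set[tuple[str, str, str]] = field(default_factory=set)


def aggregate(graphs: list[SimpleGraph]) -> SimpleGraph:
    """Merge several graphs by taking the union of entities and relations."""
    merged = SimpleGraph()
    for graph in graphs:
        merged.entities |= graph.entities
        merged.relations |= graph.relations
    return merged


def normalize(graph: SimpleGraph, aliases: dict[str, str]) -> SimpleGraph:
    """Rewrite entity names to a canonical form using a hand-made alias map."""
    def canon(name: str) -> str:
        return aliases.get(name, name)

    return SimpleGraph(
        entities={canon(e) for e in graph.entities},
        relations={(canon(s), p, canon(o)) for s, p, o in graph.relations},
    )
```

Merging first and normalizing afterwards means that duplicate spellings coming from different source texts (for example `bob` and `Bob`) collapse into one node, so relations extracted from separate documents connect through the same entity.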

Maintenance & Community

The project is maintained by stair-lab. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

License: MIT.
Compatibility: The permissive MIT license allows commercial use and integration with closed-source projects.

Limitations & Caveats

Graph quality depends on the capabilities of the chosen LLM and on the quality of the input text. Clustering and aggregation are optional and may require tuning to produce good results.

Health Check

Last Commit: 1 month ago
Responsiveness: 1 week

Star History

81 stars in the last 30 days