ai-knowledge-graph  by robert-mcdermott

Knowledge graph generator from unstructured text

created 4 months ago
1,083 stars

Top 35.7% on sourcepulse

Project Summary

This project generates interactive knowledge graphs from unstructured text using LLMs. It targets researchers, analysts, and developers who need to extract and visualize complex relationships from documents. The system automates knowledge extraction, entity standardization, and relationship inference, and produces navigable graph visualizations.

How It Works

The system processes text in overlapping chunks to manage LLM context windows. An LLM extracts Subject-Predicate-Object (SPO) triplets from each chunk. Optionally, it standardizes entity names across the graph for consistency and infers new relationships between disconnected components using LLM analysis and lexical similarity. The final knowledge graph is visualized interactively using PyVis, with nodes sized by importance and colored by community.
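The overlapping-chunk step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `chunk_words`, the word-based splitting, and the default sizes are all assumptions made for the example.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of `chunk_size` words, each sharing `overlap`
    words with the previous chunk. Hypothetical helper for illustration."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk made only of overlap words is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_words(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a relationship that straddles a chunk boundary still appears whole in at least one chunk, at the cost of sending some words to the LLM twice.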

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt or uv sync.
  • Run: python generate-graph.py --input your_text_file.txt --output knowledge_graph.html.
  • Requires Python 3.11+.
  • Compatible with any OpenAI-compatible API endpoint (Ollama, LM Studio, OpenAI, vLLM, LiteLLM).
  • Configuration is managed via config.toml.
  • See official documentation for detailed usage and configuration.

Highlighted Details

  • Supports multiple LLM backends via OpenAI-compatible API.
  • Features optional entity standardization and relationship inference to enhance graph coherence.
  • Generates interactive visualizations with community detection and node sizing.
  • Outputs both HTML visualizations and raw JSON data.
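To make the data flow concrete, the sketch below turns a handful of SPO triplets into node and edge records with degree-based node sizes, then serializes them as JSON. Everything here is assumed for illustration: the `normalize` and `build_graph` helpers, the size formula, and the JSON field names are hypothetical, and the simple lowercase/whitespace normalization only stands in for the LLM-based entity standardization described above. Community detection and the PyVis rendering are omitted.

```python
import json
from collections import Counter


def normalize(name: str) -> str:
    """Crude stand-in for entity standardization: lowercase, collapse spaces."""
    return " ".join(name.lower().split())


def build_graph(triples: list[tuple[str, str, str]]) -> dict:
    """Build node/edge records from (subject, predicate, object) triplets.
    Node size grows with degree (hypothetical formula: 10 + 5 * degree)."""
    degree: Counter[str] = Counter()
    edges: list[dict] = []
    seen: set[tuple[str, str, str]] = set()

    for subj, pred, obj in triples:
        subj, obj = normalize(subj), normalize(obj)
        key = (subj, pred, obj)
        if key in seen:  # skip duplicates created by normalization
            continue
        seen.add(key)
        edges.append({"source": subj, "predicate": pred, "target": obj})
        degree[subj] += 1
        degree[obj] += 1

    nodes = [{"id": n, "size": 10 + 5 * d} for n, d in sorted(degree.items())]
    return {"nodes": nodes, "edges": edges}


if __name__ == "__main__":
    triples = [
        ("Marie Curie", "discovered", "Radium"),
        ("marie  curie", "won", "Nobel Prize"),
        ("Marie Curie", "discovered", "radium"),
    ]
    print(json.dumps(build_graph(triples), indent=2))
```

Degree is only one possible importance measure; the README does not say which measure the project uses for node sizing.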

Maintenance & Community

The project appears to be maintained by a single author, robert-mcdermott. The README mentions no community channels or roadmap.

Licensing & Compatibility

The README does not explicitly state a license. The presence of pyproject.toml suggests standard Python packaging, but license terms for commercial use or closed-source linking are not specified.

Limitations & Caveats

Because extraction and inference rely on LLMs, accuracy and performance depend on the chosen model and on prompt quality. The README gives no LLM performance benchmarks and does not describe potential failure modes. Community support and long-term maintenance are not clearly indicated.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 972 stars in the last 90 days
