graph_maker  by rahulnyk

Python library for knowledge graph creation from text

created 1 year ago
491 stars

Top 63.7% on sourcepulse

Project Summary

This Python library, knowledge-graph-maker, enables the creation of knowledge graphs from unstructured text by leveraging Large Language Models (LLMs) and a defined ontology. It's designed for researchers and developers looking to extract structured relationships from text for analysis, graph algorithms, or Retrieval Augmented Generation (RAG) applications.

How It Works

The workflow starts with a user-defined ontology that specifies the allowed entity labels and relationships. The text is then split into chunks small enough to fit the LLM's context window, and each chunk is wrapped in a Document object whose metadata gives context to the relationships extracted from it. The user selects an LLM client (e.g., Groq, OpenAI), which processes the documents and extracts graph edges according to the ontology. The output is a list of edges, which can optionally be saved to Neo4j for further analysis or visualization.
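The flow described here (ontology, chunking, documents with metadata, edge extraction) can be sketched in plain Python. This is a minimal illustration, not the library's API: the classes Ontology, Document and Edge, the helpers chunk_text and build_graph, and the toy_extract stand-in for an LLM client are all hypothetical names defined for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# All types below are hypothetical stand-ins, not the library's own classes.
@dataclass
class Ontology:
    labels: List[str]
    relationships: List[str]


@dataclass
class Document:
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Edge:
    node_1: str
    node_2: str
    relationship: str
    metadata: Dict[str, str] = field(default_factory=dict)


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars becomes its own oversized chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_graph(
    docs: List[Document],
    ontology: Ontology,
    extract: Callable[[Document], List[Edge]],
) -> List[Edge]:
    """Run the extractor on each document and keep ontology-conforming edges."""
    edges: List[Edge] = []
    for doc in docs:
        for edge in extract(doc):
            if edge.relationship not in ontology.relationships:
                continue  # drop relationships the ontology does not allow
            # Carry the chunk's metadata over so each edge keeps its context.
            edge.metadata = {**doc.metadata, **edge.metadata}
            edges.append(edge)
    return edges


def toy_extract(doc: Document) -> List[Edge]:
    """Stand-in for an LLM call: links consecutive capitalized words."""
    names = [w.strip(".,") for w in doc.text.split() if w[0].isupper()]
    return [Edge(a, b, "mentioned_with") for a, b in zip(names, names[1:])]
```

In a real setup, toy_extract would be replaced by a call to the chosen LLM client, and the returned edge list could then be passed on to a graph store.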

Quick Start & Requirements

  • Install via pip: $ pip install knowledge-graph-maker
  • For project setup, use Poetry: $ poetry config --local virtualenvs.in-project true && poetry install
  • Requires an LLM API key (Groq or OpenAI).
  • See the official documentation and the package's GitHub page for detailed usage and custom LLM client integration.

Highlighted Details

  • Supports custom LLM client integration.
  • Outputs graph data as a list of Edge Pydantic models.
  • Includes an optional step to save extracted graphs to Neo4j.
  • Demonstrates a novel approach to Graph Retrieval Augmented Generation (GRAG).
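As an illustration of the optional Neo4j step, the sketch below turns one extracted edge into a parameterized Cypher MERGE statement. It is an assumption about how such a step could look, not the library's implementation: the function edge_to_cypher and the name property on nodes are hypothetical. Node labels and relationship types are validated and interpolated into the query text, while node names are passed as query parameters.

```python
import re
from typing import Dict, Tuple

# Conservative pattern for node labels and relationship types (an assumption
# made for this sketch; it is stricter than what Neo4j itself accepts).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(identifier: str) -> str:
    """Reject anything that is not a plain identifier before interpolation."""
    if not _IDENTIFIER.match(identifier):
        raise ValueError(f"unsafe identifier: {identifier!r}")
    return identifier


def edge_to_cypher(
    source_label: str,
    source_name: str,
    relationship: str,
    target_label: str,
    target_name: str,
) -> Tuple[str, Dict[str, str]]:
    """Build a MERGE query and its parameters for a single edge (hypothetical helper)."""
    query = (
        f"MERGE (a:{_check(source_label)} {{name: $source}}) "
        f"MERGE (b:{_check(target_label)} {{name: $target}}) "
        f"MERGE (a)-[:{_check(relationship)}]->(b)"
    )
    return query, {"source": source_name, "target": target_name}
```

The returned query and parameter dictionary could then be handed to whichever Neo4j client is in use. MERGE rather than CREATE keeps repeated runs from duplicating nodes and relationships.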

Maintenance & Community

The project is maintained by rahulnyk. Further community or roadmap information is not detailed in the README.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The library relies on LLM performance for accurate ontology adherence and relationship extraction. Chunking strategy is critical due to LLM context window limitations. API rate limits may necessitate delays between processing documents, as indicated by the delay_s_between parameter.
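The rate-limit point can be made concrete with a small loop that pauses between documents. This is a sketch under stated assumptions: process_with_delay and its extract callback are hypothetical, and only the parameter name delay_s_between is borrowed from the text above; how the library itself applies that parameter is not confirmed here.

```python
import time
from typing import Callable, Iterable, List, TypeVar

D = TypeVar("D")  # document type
E = TypeVar("E")  # edge type


def process_with_delay(
    docs: Iterable[D],
    extract: Callable[[D], List[E]],
    delay_s_between: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[E]:
    """Call extract on each document, sleeping between consecutive calls.

    The sleep function is injectable so the pacing can be tested without
    actually waiting.
    """
    results: List[E] = []
    for index, doc in enumerate(docs):
        # No pause before the first document, only between documents.
        if index > 0 and delay_s_between > 0:
            sleep(delay_s_between)
        results.extend(extract(doc))
    return results
```

With n documents this pauses n - 1 times, so the total added wait is roughly (n - 1) * delay_s_between seconds.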

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 28 stars in the last 90 days
