automatic-KG-creation-with-LLM by fusion-jena

KG construction pipeline using LLMs

created 1 year ago
269 stars

Top 96.2% on sourcepulse

Project Summary

This repository provides a pipeline for the semi-automatic construction of Ontologies and Knowledge Graphs (KGs) using Large Language Models (LLMs). It targets researchers and practitioners in knowledge representation and AI, offering a reproducible framework to generate KGs from scholarly publications with minimal human expert intervention.

How It Works

The approach leverages LLMs (Mixtral 8x22B, GPT-4o, GPT-3.5, Gemini) to automate KG creation. The pipeline begins with formulating competency questions (CQs), then uses these CQs to guide the development of an ontology (TBox). This ontology is subsequently employed to construct the KG from source documents, with evaluation metrics provided for assessing the output.
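The three stages described above can be sketched as a chain of prompt calls. This is a minimal illustration and not the repository's code: the `LLM` type, the functions `generate_cqs`, `generate_ontology`, and `build_kg`, and the prompt wording are all hypothetical, and `fake_llm` is a canned stand-in so the sketch runs without model access.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]
Triple = Tuple[str, str, str]


def generate_cqs(llm: LLM, domain: str) -> List[str]:
    """Stage 1: ask the model for competency questions, one per line."""
    reply = llm(f"CQS: List competency questions for the domain: {domain}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def generate_ontology(llm: LLM, cqs: List[str]) -> List[str]:
    """Stage 2: derive ontology (TBox) terms from the competency questions."""
    reply = llm("ONTOLOGY: Propose classes and relations for:\n" + "\n".join(cqs))
    return [line.strip() for line in reply.splitlines() if line.strip()]


def build_kg(llm: LLM, ontology: List[str], document: str) -> List[Triple]:
    """Stage 3: extract triples, keeping only relations named in the ontology."""
    reply = llm(
        "KG: Extract 'subject | relation | object' triples using only these terms:\n"
        + "\n".join(ontology) + "\nText:\n" + document
    )
    triples: List[Triple] = []
    for line in reply.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and parts[1] in ontology:
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def fake_llm(prompt: str) -> str:
    """Canned responses standing in for a real model call (assumption)."""
    if prompt.startswith("CQS:"):
        return "Which model does the paper use?\nWhich dataset is it trained on?"
    if prompt.startswith("ONTOLOGY:"):
        return "usesModel\ntrainedOn"
    return "Paper A | usesModel | ResNet-50\nPaper A | cites | Paper B"


cqs = generate_cqs(fake_llm, "deep learning methodologies")
ontology = generate_ontology(fake_llm, cqs)
kg = build_kg(fake_llm, ontology, "Paper A uses ResNet-50 ...")
print(kg)
```

Filtering extracted triples against the ontology is one way the TBox can constrain the KG stage. The actual pipeline may enforce this differently, for example inside the prompt alone.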

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Set Hugging Face access token in helper_functions.py.
  • Tested with Python 3.10.16 on Linux.
  • Execute pipeline via main.py, configuring with config.ini.
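As a rough illustration of the setup steps above, the sketch below checks the interpreter against the tested version and reads an INI file with Python's standard `configparser`. The section and key names (`llm`, `paths`, `model`, and so on) are hypothetical placeholders; the repository's actual config.ini may use different ones.

```python
import configparser
import sys

# The README reports testing on Python 3.10.16 only.
TESTED = (3, 10)


def version_note(current=sys.version_info[:2], tested=TESTED) -> str:
    """Return a short note on whether the interpreter matches the tested version."""
    if tuple(current) == tuple(tested):
        return "matches tested version"
    return (
        f"untested Python {current[0]}.{current[1]} "
        f"(README reports {tested[0]}.{tested[1]})"
    )


# Hypothetical config.ini contents; the real file's sections and keys may differ.
EXAMPLE_INI = """
[llm]
model = mixtral-8x22b
temperature = 0.0

[paths]
input_dir = data/papers
output_dir = results
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)

model = config.get("llm", "model")
temperature = config.getfloat("llm", "temperature")
output_dir = config.get("paths", "output_dir", fallback="results")

print(version_note())
print(model, temperature, output_dir)
```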

Highlighted Details

  • Evaluates four state-of-the-art LLMs for KG construction.
  • Includes code for data preprocessing, prompt engineering, ontology generation, and KG evaluation.
  • Organizes experimental outputs including generated KGs, ontologies, and NER results.
  • Demonstrates feasibility on deep learning methodologies using scholarly publications.
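The summary does not say which metrics the KG evaluation code computes. As a generic illustration only, the sketch below scores predicted triples against a reference set by exact match after light normalization. The function names and the sample triples are hypothetical and are not taken from the repository.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def normalize(triple: Triple) -> Triple:
    """Lowercase and trim each part so trivial formatting differences still match."""
    return tuple(part.strip().lower() for part in triple)


def triple_prf(predicted: Iterable[Triple], gold: Iterable[Triple]):
    """Exact-match precision, recall, and F1 over sets of triples."""
    pred = {normalize(t) for t in predicted}
    ref = {normalize(t) for t in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only.
predicted = [
    ("Paper A", "usesModel", "ResNet-50"),
    ("Paper A", "trainedOn", "CIFAR-10"),
]
gold = [
    ("paper a", "usesmodel", "resnet-50"),
    ("Paper A", "trainedOn", "ImageNet"),
]

precision, recall, f1 = triple_prf(predicted, gold)
print(precision, recall, f1)
```

Exact matching is strict: a triple that is semantically right but worded differently counts as wrong, which is one reason expert review may still be needed in a semi-automatic pipeline.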

Maintenance & Community

The project is associated with research publications, with citations provided for different versions. No specific community channels (e.g., Discord, Slack) are mentioned in the README.

Licensing & Compatibility

Licensed under Apache License 2.0. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The README reports testing only with Python 3.10.16 on Linux, so behavior on other Python versions or operating systems is unverified. The pipeline is described as semi-automatic, which means some human expert input or review is still required.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 34 stars in the last 90 days
