CLI tool for knowledge graph construction from documents
Docs2KG offers a unified approach to constructing knowledge graphs from diverse document types, targeting researchers and developers who need to extract structured information from unstructured text. It leverages a human-LLM collaborative framework to improve the quality and efficiency of knowledge graph generation.
How It Works
Docs2KG employs a hybrid bottom-up and top-down strategy, integrating Large Language Models (LLMs) for knowledge graph and ontology construction. It categorizes knowledge into MetaKG (document metadata), LayoutKG (document structure), and SemanticKG (content entities and relations). A key feature is its human-LLM collaborative interface, enabling iterative refinement of the knowledge graph based on human feedback, which in turn enhances the LLM's performance.
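The three knowledge categories and the human review step can be pictured with a small in-memory structure. The sketch below is illustrative only and is not Docs2KG's API: the class names (Node, Edge, LayeredGraph), the review callback, and the sample nodes and relations are all hypothetical. Only the three layer names come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Layer names come from the text; everything else here is an assumption.
LAYERS = ("MetaKG", "LayoutKG", "SemanticKG")


@dataclass(frozen=True)
class Node:
    node_id: str
    layer: str   # one of LAYERS
    label: str


@dataclass
class Edge:
    source: str
    target: str
    relation: str
    approved: bool = False  # flipped to True once a reviewer accepts the edge


@dataclass
class LayeredGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node_id: str, layer: str, label: str) -> Node:
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        node = Node(node_id, layer, label)
        self.nodes[node_id] = node
        return node

    def add_edge(self, source: str, target: str, relation: str) -> Edge:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        edge = Edge(source, target, relation)
        self.edges.append(edge)
        return edge

    def layer_nodes(self, layer: str) -> List[Node]:
        return [n for n in self.nodes.values() if n.layer == layer]

    def review(self, decide: Callable[[Edge], bool]) -> None:
        """Keep only the edges the reviewer callback accepts, marking them approved."""
        kept = []
        for edge in self.edges:
            if decide(edge):
                edge.approved = True
                kept.append(edge)
        self.edges = kept


def build_example() -> LayeredGraph:
    # Hypothetical sample data, one or two nodes per layer.
    g = LayeredGraph()
    g.add_node("doc1", "MetaKG", "report.pdf")
    g.add_node("sec1", "LayoutKG", "Section 1")
    g.add_node("ent1", "SemanticKG", "Alice")
    g.add_node("ent2", "SemanticKG", "Acme")
    g.add_edge("doc1", "sec1", "HAS_SECTION")
    g.add_edge("sec1", "ent1", "MENTIONS")
    g.add_edge("ent1", "ent2", "WORKS_AT")    # candidate relation
    g.add_edge("ent2", "ent1", "LOCATED_IN")  # implausible candidate
    return g


if __name__ == "__main__":
    graph = build_example()
    # A human reviewer rejects the implausible relation.
    graph.review(lambda e: e.relation != "LOCATED_IN")
    for e in graph.edges:
        print(e.source, e.relation, e.target, e.approved)
```

In this sketch the reviewer is a plain callback, so the same loop could be driven by a person at a prompt or by a stored set of decisions. How Docs2KG feeds that feedback back to the LLM is not described here and is not modeled.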
Quick Start & Requirements
Install with pip install Docs2KG, then download the spaCy model with python -m spacy download en_core_web_sm. The CONFIG_FILE environment variable is used for configuration. Commands include docs2kg process-document, docs2kg batch-process, docs2kg list-formats, and docs2kg neo4j.
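For scripted use, the commands listed above can be wrapped from Python. The following is a minimal sketch under stated assumptions: the helper names (build_invocation, run) and the config.yml path are hypothetical, and because the arguments each subcommand accepts are not given here, none are passed. Only the subcommand names and the CONFIG_FILE variable come from the text.

```python
import os
import shutil
import subprocess

# Subcommand names as listed in the text above.
SUBCOMMANDS = ("process-document", "batch-process", "list-formats", "neo4j")


def build_invocation(subcommand, config_path, extra_args=()):
    """Return (argv, env) for a docs2kg call without running anything."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    env = dict(os.environ)                 # copy, so the caller's environment is untouched
    env["CONFIG_FILE"] = str(config_path)  # assumption: the variable holds a path to a config file
    argv = ["docs2kg", subcommand, *extra_args]
    return argv, env


def run(subcommand, config_path, extra_args=()):
    """Run the command if the docs2kg executable is on PATH, otherwise return None."""
    argv, env = build_invocation(subcommand, config_path, extra_args)
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    result = run("list-formats", "config.yml")  # hypothetical config path
    if result is None:
        print("docs2kg is not installed")
    else:
        print(result.stdout)
```

Separating build_invocation from run keeps the command construction inspectable without the tool installed. Consult the project's own help output for the real arguments before passing anything through extra_args.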
Highlighted Details
Maintenance & Community
The project is associated with AI4WA. Further community or maintenance details are not explicitly provided in the README.
Licensing & Compatibility
The README does not explicitly state the license. It provides an arXiv citation, suggesting it is research-oriented. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is presented as a research contribution (arXiv:2406.02962), implying it may be in an early stage of development. Specific limitations regarding supported LLMs, scalability, or robustness in production environments are not detailed.