juanceresa
Build interactive knowledge graphs from any documents
Top 68.3% on SourcePulse
This project addresses the challenge of transforming unstructured documents into structured, explorable knowledge graphs. It targets researchers, analysts, and power users who need to quickly understand complex relationships within large document collections without extensive coding or infrastructure setup. The primary benefit is rapid, interactive knowledge graph generation directly from the command line, enabling deep insights into document connections.
How It Works
sift-kg employs a multi-stage pipeline. Documents are first processed for text extraction, with support for over 75 formats and optional OCR for scanned content. An LLM then either performs schema discovery or uses a predefined domain to identify the entity and relation types relevant to the corpus. Those types guide extraction, which produces a NetworkX-based knowledge graph. A key differentiator is human-in-the-loop entity resolution: the LLM proposes merges, but users must approve them via an interactive terminal UI or by editing YAML files, which keeps accuracy and control with the user. The process concludes with an interactive browser-based viewer and export options.
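The approval step in entity resolution can be illustrated with a minimal sketch. This is not sift-kg's actual code or API. It uses plain Python tuples as a stand-in for the NetworkX graph, and the names `MergeProposal` and `apply_approved_merges` are hypothetical, introduced only to show the idea that proposed merges change the graph only after a human has approved them.

```python
from dataclasses import dataclass


@dataclass
class MergeProposal:
    """Hypothetical record of one proposed entity merge (not sift-kg's schema)."""
    canonical: str          # name to keep
    duplicate: str          # name to fold into the canonical one
    approved: bool = False  # stays False until a human reviewer approves it


def apply_approved_merges(edges, proposals):
    """Rewrite edge endpoints for approved merges only.

    Edges are (source, relation, target) tuples. Unapproved proposals are
    ignored, and exact duplicate edges or self-loops created by a merge
    are dropped.
    """
    alias = {p.duplicate: p.canonical for p in proposals if p.approved}

    def resolve(name):
        # Follow alias chains (A -> B -> C) while guarding against cycles.
        seen = set()
        while name in alias and name not in seen:
            seen.add(name)
            name = alias[name]
        return name

    merged, seen_edges = [], set()
    for source, relation, target in edges:
        edge = (resolve(source), relation, resolve(target))
        if edge[0] == edge[2] or edge in seen_edges:
            continue
        seen_edges.add(edge)
        merged.append(edge)
    return merged


# Illustrative data only.
edges = [
    ("Acme Corp", "EMPLOYS", "J. Smith"),
    ("ACME Corporation", "EMPLOYS", "John Smith"),
    ("John Smith", "AUTHORED", "Report A"),
]
proposals = [
    MergeProposal("Acme Corp", "ACME Corporation", approved=True),
    MergeProposal("John Smith", "J. Smith", approved=False),  # not yet reviewed
]
result = apply_approved_merges(edges, proposals)
```

In this sketch the approved company merge is applied, while "J. Smith" and "John Smith" remain separate nodes because that proposal has not been approved.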
Quick Start & Requirements
Install: pip install sift-kg (or, with the embeddings extra, pip install sift-kg[embeddings]).
Setup: run sift init, configure API keys in .env, then run sift extract, sift build, and sift view.
Highlighted Details
Maintenance & Community
The provided README does not detail specific contributors, sponsorships, or community channels like Discord or Slack.
Licensing & Compatibility
Limitations & Caveats
The tool relies on external LLM APIs, incurring potential costs and requiring API key management. While offering local OCR options, setting up advanced OCR backends or embedding models introduces additional dependencies. The "no code" claim applies to the primary CLI workflow; programmatic use requires Python scripting. For high-accuracy use cases, the human-in-the-loop review process for entity resolution is essential and requires user time.