yifanfeng97/hyper-extract
LLM-driven framework for transforming text into structured knowledge
Top 59.9% on SourcePulse
Hyper-Extract uses LLMs to transform unstructured text into structured knowledge, letting users generate Knowledge Abstracts from documents. It supports a wide range of output formats, from simple collections and Pydantic models to Knowledge Graphs, Hypergraphs, and Spatio-Temporal Graphs. The framework targets engineers, researchers, and power users who need to extract and understand information from diverse text sources, under the motto "stop reading, start understanding".
How It Works
Hyper-Extract employs a three-layer architecture: Auto-Types define structured output formats (e.g., AutoGraph, AutoHypergraph, AutoSpatioTemporalGraph), Methods provide extraction algorithms (including RAG-based approaches like GraphRAG and Hyper-RAG), and Templates offer domain-specific configurations. This design allows for declarative, zero-code extraction via YAML templates, supporting over 80 presets across six domains. The framework facilitates incremental knowledge evolution, allowing continuous updates as new documents are processed.
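To make the layering concrete, here is a minimal sketch of the same idea in plain Python. It does not use the hyperextract package: `SimpleGraph`, `toy_extractor`, `TEMPLATE`, and `feed` are hypothetical names invented for illustration, and the rule-based extractor stands in for the place where a real method would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-ins; none of these names come from the hyperextract package.


@dataclass
class SimpleGraph:
    """Output-type layer: a deduplicating container for nodes and edges."""
    nodes: set[str] = field(default_factory=set)
    edges: set[tuple[str, str, str]] = field(default_factory=set)

    def merge(self, other: "SimpleGraph") -> None:
        # Set union means a fact seen in several documents is stored once.
        self.nodes |= other.nodes
        self.edges |= other.edges


def toy_extractor(text: str, relation: str) -> SimpleGraph:
    """Method layer: rule-based placeholder for an LLM-backed extractor.

    Parses sentences of the form "<subject> <relation> <object>".
    """
    graph = SimpleGraph()
    for sentence in text.split("."):
        parts = sentence.strip().split(f" {relation} ")
        if len(parts) == 2 and all(parts):
            subj, obj = parts[0].strip(), parts[1].strip()
            graph.nodes.update({subj, obj})
            graph.edges.add((subj, relation, obj))
    return graph


# Template layer: declarative settings that pick the method and its parameters.
TEMPLATE = {"method": toy_extractor, "relation": "founded"}


def feed(knowledge: SimpleGraph, document: str, template: dict) -> SimpleGraph:
    """Run the template's method on one document and fold the result in."""
    method: Callable[[str, str], SimpleGraph] = template["method"]
    knowledge.merge(method(document, template["relation"]))
    return knowledge


kb = SimpleGraph()
feed(kb, "Ada founded Acme. Bob founded Initech.", TEMPLATE)
feed(kb, "Ada founded Acme. Carol founded Globex.", TEMPLATE)
print(sorted(kb.edges))
```

Feeding the second document adds only the new fact about Carol. The repeated fact about Ada is absorbed by the set union, which is one simple way to model incremental updates without duplicating knowledge.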
Quick Start & Requirements
CLI install: uv tool install hyperextract
Library install: uv pip install hyperextract
From source: clone https://github.com/yifanfeng97/hyper-extract.git, cd hyper-extract, then uv sync.
Default models: gpt-4o-mini and text-embedding-3-small.
Documentation: https://yifanfeng97.github.io/Hyper-Extract/latest/
Examples: examples/en/ directory within the repository.
Highlighted Details
Provides a command-line interface (he) and a Python API for integration into existing workflows.
Maintenance & Community
The project is marked as active. Contributions via Issues and Pull Requests are welcomed. Specific community channels (e.g., Discord, Slack) or a public roadmap are not detailed in the provided README.
Licensing & Compatibility
Licensed under Apache-2.0, a permissive license that is generally compatible with commercial use and with linking into closed-source projects.
Limitations & Caveats
The default configuration relies on OpenAI API keys, which may incur costs and introduce vendor-specific dependencies. While the framework supports multiple extraction engines, detailed performance benchmarks or comparisons against non-OpenAI LLM providers are not explicitly presented.