knowledge_graph  by rahulnyk

Knowledge graph pipeline for text corpus analysis

created 1 year ago
1,912 stars

Top 23.3% on sourcepulse

Project Summary

This project provides a Python-based pipeline for converting any text corpus into a knowledge graph. It targets researchers and developers interested in Graph Retrieval Augmented Generation (GRAG) or knowledge-graph-based question answering. By representing concepts and their relationships as a graph, it supports deeper text analysis and richer conversational AI.

How It Works

The pipeline splits the text into chunks, extracts concepts (rather than just entities) from each chunk using a local LLM (Mistral 7B OpenOrca), and infers relationships from co-occurrence: two concepts that appear in the same chunk are connected by an edge. When a pair of concepts co-occurs in several chunks, the edges are merged, with the weight reflecting the number of occurrences and the relationship descriptions concatenated. The system also calculates node degrees and communities, which are used to size and color nodes in the visualization.
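
The co-occurrence step described above can be illustrated with a minimal sketch. This is not the project's code: the function names, the chunk sizes, and the `vocabulary` parameter are all hypothetical, and the LLM call is replaced by a simple keyword lookup so the example runs without a model. Edge merging is shown as a running count plus a list of the chunk ids where each pair appeared.

```python
from collections import defaultdict
from itertools import combinations


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (sizes are illustrative)."""
    step = chunk_size - overlap
    stop = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def extract_concepts(chunk, vocabulary):
    """Hypothetical stand-in for the LLM call: keyword lookup in the chunk."""
    lowered = chunk.lower()
    return sorted(c for c in vocabulary if c in lowered)


def cooccurrence_edges(chunks, vocabulary):
    """Build one edge per concept pair, merging repeats across chunks."""
    edges = defaultdict(lambda: {"weight": 0, "chunks": []})
    for chunk_id, chunk in enumerate(chunks):
        concepts = extract_concepts(chunk, vocabulary)
        # Every pair of concepts found in the same chunk is connected.
        for a, b in combinations(concepts, 2):
            edges[(a, b)]["weight"] += 1          # repeated pairs add weight
            edges[(a, b)]["chunks"].append(chunk_id)
    return dict(edges)


if __name__ == "__main__":
    chunks = ["Graphs link concepts.", "Concepts form graphs and edges."]
    vocabulary = {"graphs", "concepts", "edges"}
    for pair, data in cooccurrence_edges(chunks, vocabulary).items():
        print(pair, data)
```

Sorting the concepts before pairing keeps each pair in a canonical order, so `("concepts", "graphs")` from two different chunks lands on the same edge instead of creating a mirrored duplicate.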

Quick Start & Requirements

Highlighted Details

  • Leverages Mistral 7B OpenOrca via Ollama for local, cost-free concept extraction.
  • Utilizes NetworkX for graph manipulation and Pyvis for interactive web-based visualizations.
  • Generates graph metrics like node degree and community structure.
  • Focuses on "concepts" over traditional entities for richer semantic representation.
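
As a rough illustration of the graph metrics mentioned above, the following sketch uses NetworkX to compute node degree and a community assignment for each node. It is an assumption-laden example, not the project's implementation: the helper names `build_graph` and `node_styles` are hypothetical, and `greedy_modularity_communities` is used here only as one readily available community-detection routine; the project may use a different algorithm.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def build_graph(weighted_edges):
    """Build an undirected graph from {(node_a, node_b): weight} pairs."""
    graph = nx.Graph()
    for (a, b), weight in weighted_edges.items():
        graph.add_edge(a, b, weight=weight)
    return graph


def node_styles(graph):
    """Map each node to a size (its degree) and a group (its community index)."""
    degrees = dict(graph.degree())
    # Assumed algorithm choice: greedy modularity maximization.
    communities = greedy_modularity_communities(graph, weight="weight")
    group_of = {
        node: index
        for index, members in enumerate(communities)
        for node in members
    }
    return {
        node: {"size": degrees[node], "group": group_of[node]}
        for node in graph.nodes
    }


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (c-d).
    edges = {
        ("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
        ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1,
        ("c", "d"): 1,
    }
    for node, style in node_styles(build_graph(edges)).items():
        print(node, style)
```

The resulting `size` and `group` values are the kind of per-node attributes a visualization layer such as Pyvis could consume for sizing and coloring.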

Maintenance & Community

The project is seeking contributions for backend improvements (embedding deduplication, concept normalization, filtering) and frontend development for interactive graph exploration.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is described as needing "a lot more work" and lists several suggested improvements, indicating it may be in an early or experimental stage. The lack of a specified license could pose a barrier to commercial adoption.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

100 stars in the last 90 days