neo4j-python-pandas-py2neo-v3  by MazzaWill

Excel to Neo4j knowledge graph builder

Created 7 years ago
579 stars

Top 55.4% on SourcePulse

Project Summary

This repository provides example code for building Neo4j knowledge graphs from Excel data. It serves both educational/legacy use and modern AI-driven applications through two paths: the original py2neo v3 stack, and a newer one built on the current Neo4j driver, vector indexes, and GraphRAG. It is aimed at learners and developers who need to load tabular data into a graph database.

How It Works

The project builds a knowledge graph by first reading invoice-style Excel data with pandas, then extracting node and relationship information and writing it to Neo4j as nodes and relationships. The legacy path uses py2neo v3, while a modern, additive example uses the official Neo4j Python driver, Neo4j 5+/2026 vector indexes, and GraphRAG for semantic retrieval. The project can also convert Neo4j graph data into matrices for downstream machine learning experiments.
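The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the column names (buyer, seller, amount), the node labels, and the BOUGHT_FROM relationship type are all assumptions made for the example, and the rows are built in memory rather than read from the sample workbook.

```python
import pandas as pd

# Hypothetical invoice-style rows; the real column names live in the
# repository's sample workbook and may differ. With a real file this
# frame would typically come from pd.read_excel(path).
rows = pd.DataFrame(
    {
        "buyer": ["Acme Ltd", "Acme Ltd", "Globex"],
        "seller": ["Initech", "Umbrella", "Initech"],
        "amount": [120.0, 75.5, 310.0],
    }
)


def extract_nodes(frame: pd.DataFrame, column: str, label: str) -> pd.DataFrame:
    """Return one row per distinct value in `column`, tagged with a node label."""
    names = frame[column].dropna().drop_duplicates().reset_index(drop=True)
    return pd.DataFrame({"label": label, "name": names})


def extract_relationships(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one (start, type, end, amount) row per invoice line."""
    rels = frame[["buyer", "seller", "amount"]].dropna(subset=["buyer", "seller"])
    rels = rels.rename(columns={"buyer": "start", "seller": "end"})
    rels.insert(1, "type", "BOUGHT_FROM")  # assumed relationship type
    return rels.reset_index(drop=True)


buyers = extract_nodes(rows, "buyer", "Buyer")
sellers = extract_nodes(rows, "seller", "Seller")
relationships = extract_relationships(rows)
```

Keeping nodes and relationships in separate frames means duplicates are removed once, before anything is written to the database.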

Quick Start & Requirements

For the legacy environment, install dependencies using pip install -r requirements.txt. The original working environment was Python 3.6.5, Windows 10, Neo4j 3.x, and py2neo 3. Users must update local paths and Neo4j connection settings in dataToNeo4jClass/DataToNeo4jClass.py.

The modern example can be started with:

    python -m examples.modern_invoice_graphrag.app \
      --input examples/modern_invoice_graphrag/sample_invoice_rows.csv \
      --limit 2 \
      dry-run

Sample data Invoice_data_Demo.xls is included.
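As an illustration of the connection settings mentioned above, here is one possible way to keep them out of source code and write rows with the official Neo4j Python driver. It is a sketch under stated assumptions: the environment variable names, the Buyer/Seller labels, the BOUGHT_FROM type, and the row keys are hypothetical and do not come from the repository. The driver is imported lazily, so the settings and query can be inspected without a running database.

```python
import os
from typing import Dict, List


def load_settings() -> Dict[str, str]:
    """Read connection settings from assumed environment variable names."""
    return {
        "uri": os.environ.get("NEO4J_URI", "bolt://localhost:7687"),
        "user": os.environ.get("NEO4J_USER", "neo4j"),
        "password": os.environ.get("NEO4J_PASSWORD", ""),
    }


# Hypothetical schema: one Buyer and one Seller node per row, joined by a
# BOUGHT_FROM relationship. MERGE keeps repeated runs from duplicating data.
MERGE_QUERY = """
UNWIND $rows AS row
MERGE (b:Buyer {name: row.buyer})
MERGE (s:Seller {name: row.seller})
MERGE (b)-[r:BOUGHT_FROM]->(s)
SET r.amount = row.amount
"""


def write_rows(rows: List[dict]) -> None:
    """Send all rows in a single batched query (requires the neo4j package)."""
    from neo4j import GraphDatabase  # imported lazily; needs a 5.x driver

    settings = load_settings()
    auth = (settings["user"], settings["password"])
    with GraphDatabase.driver(settings["uri"], auth=auth) as driver:
        driver.execute_query(MERGE_QUERY, parameters_={"rows": rows})
```

Passing the rows as a single $rows parameter and unwinding them in Cypher avoids one round trip per row.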

Highlighted Details

  • Dual-path approach: Supports legacy py2neo v3 and modern Neo4j GraphRAG/vector search stacks.
  • Open-source agent skill (skills/neo4j-knowledge-graph/) for AI coding agents to design Neo4j knowledge graphs from CSV/Excel.
  • Converts Neo4j graph data into matrices for machine learning.
  • Includes example scripts for data extraction, node/edge creation, and matrix conversion.
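The matrix conversion listed above can be pictured as turning an edge list into an adjacency matrix. The sketch below uses only the standard library and an invented edge list; it is not the repository's conversion script, and the node names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical edge list, e.g. (start name, end name) pairs returned by a
# Cypher query; the names are invented for illustration.
edges: List[Tuple[str, str]] = [
    ("Acme Ltd", "Initech"),
    ("Acme Ltd", "Umbrella"),
    ("Globex", "Initech"),
]


def build_adjacency(
    pairs: Sequence[Tuple[str, str]],
) -> Tuple[List[str], List[List[int]]]:
    """Return (sorted node names, square 0/1 adjacency matrix)."""
    nodes = sorted({name for pair in pairs for name in pair})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for start, end in pairs:
        matrix[index[start]][index[end]] = 1  # directed edge start -> end
    return nodes, matrix


nodes, matrix = build_adjacency(edges)
```

Sorting the node names fixes the row and column order, so the same graph always yields the same matrix. The nested list can be handed to a numeric library if array operations are needed.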

Maintenance & Community

Maintenance resumed in June 2026, with a focus on keeping the legacy py2neo v3 example usable while adding modern Neo4j features. Modernization efforts are tracked separately. Issue triage is ongoing, with specific requirements for reporting bugs. Governance documents include CONTRIBUTING.md, SECURITY.md, SUPPORT.md, and CODE_OF_CONDUCT.md.

Licensing & Compatibility

The project is licensed under the MIT license, which permits commercial use. The repository intentionally keeps its legacy dependencies, so newer Python, pandas, Neo4j, and py2neo versions may require code modifications.

Limitations & Caveats

The primary environment for the legacy examples is pinned to older versions (Python 3.6.5, Neo4j 3.x, py2neo 3). Dependency and security modernization is an ongoing effort tracked in issue #23, and users may encounter compatibility issues or require code adjustments when using more recent software versions.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 4
  • Issues (30d): 1
  • Star history: 3 stars in the last 30 days
