Graph RAG for medical data research paper
This project provides a Graph Retrieval-Augmented Generation (RAG) system tailored for the medical domain, aiming to enhance the safety and accuracy of medical large language models. It is designed for researchers and developers working with medical data who need to integrate structured knowledge graphs with LLM-based question-answering.
How It Works
The system takes a multi-level RAG approach built on a knowledge graph constructed from medical data. It integrates data from several sources: private user data (such as MIMIC IV), curated papers and books (MedC-K, S2ORC), and structured dictionary data (UMLS). The core innovation is its hierarchical graph linking, which enables more contextually relevant retrieval for LLM inference. Grounding responses in structured medical knowledge is intended to make the generated medical information safer and more reliable.
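The idea of hierarchical linking can be illustrated with a minimal sketch. It is not the project's implementation: the `Node` class, the `retrieve` function, the keyword matching, and the toy graph contents are all assumptions made for illustration. Level 1 stands in for user data, level 2 for literature, and level 3 for dictionary entries.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entity in a toy graph (hypothetical structure, not the project's schema)."""
    name: str
    level: int  # 1 = user data, 2 = papers/books, 3 = dictionary
    text: str
    links: list[str] = field(default_factory=list)  # names of nodes one level down


def build_graph(nodes):
    """Index nodes by name for constant-time lookup."""
    return {n.name: n for n in nodes}


def retrieve(graph, query, max_level=3):
    """Match level-1 nodes by keyword overlap, then follow links downward
    to collect supporting text from the lower levels."""
    terms = set(query.lower().split())
    frontier = [
        n.name for n in graph.values()
        if n.level == 1 and terms & set(n.text.lower().split())
    ]
    seen, context = set(), []
    while frontier:
        name = frontier.pop(0)
        if name in seen or name not in graph:
            continue
        seen.add(name)
        node = graph[name]
        context.append((node.level, node.name, node.text))
        if node.level < max_level:
            frontier.extend(node.links)
    return sorted(context)  # ordered by level, then name


# Made-up toy data, for illustration only.
GRAPH = build_graph([
    Node("note_1", 1, "patient reports chest pain and elevated troponin",
         ["mi_review"]),
    Node("note_2", 1, "patient with seasonal allergies"),
    Node("mi_review", 2, "Elevated troponin is a marker of myocardial injury",
         ["troponin"]),
    Node("troponin", 3, "Troponin: a protein complex in cardiac and skeletal muscle"),
])

for level, name, text in retrieve(GRAPH, "troponin level"):
    print(level, name, text)
```

In a sketch like this, the text collected from all three levels would be passed to the LLM as context, so an answer about the user's record can be traced back to literature and dictionary entries. A real system would presumably replace the keyword overlap with embedding-based or entity-based matching.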
Quick Start & Requirements
conda env create -f medgraphrag.yml
(jundewu/medrag-post) for web-based PubMed searches.
Highlighted Details
Maintenance & Community
The project is associated with authors Junde Wu and Jiayuan Zhu. Further community engagement channels are not explicitly listed in the README.
Licensing & Compatibility
The README does not explicitly state a license. The use of datasets like MIMIC IV, MedC-K, and UMLS may be subject to their respective licenses and usage agreements, potentially restricting commercial use or redistribution of raw data.
Limitations & Caveats
Accessing and processing the full dataset hierarchy (MIMIC IV, MedC-K, UMLS) can be challenging because of data access requirements and licensing. The maintainers are working on providing simpler example datasets to ease implementation.