MedRAG by SNOWTEAM2023

Healthcare copilot enhancing diagnostic accuracy

Created 1 year ago

266 stars

Top 96.4% on SourcePulse

Project Summary

Summary

MedRAG enhances Retrieval-Augmented Generation (RAG) for healthcare copilot applications by integrating reasoning elicited from a Knowledge Graph (KG). It targets healthcare professionals and aims to improve diagnostic accuracy and reduce the risk of misdiagnosis for complex diseases and for diseases that resemble one another, by offering precise diagnostic support and personalized treatment recommendations.

How It Works

The core approach involves constructing a hierarchical disease knowledge graph and combining it with EHR retrieval for RAG-based reasoning. This KG-enhanced reasoning aims to provide more accurate diagnostic insights and personalized suggestions by integrating multi-level patient information and disease relationships.
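The flow described above can be illustrated with a minimal sketch. It is not MedRAG's implementation: the toy graph, the toy records, the function names, and the use of Jaccard overlap as a similarity measure are all assumptions made for illustration. The sketch picks the closest category in a small disease hierarchy, lists the features that separate the diseases inside that category, retrieves similar records, and assembles a prompt for an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disease:
    name: str
    category: str        # parent node in the toy hierarchy
    features: frozenset  # symptoms linked to the disease in the toy graph


# Toy knowledge graph: illustrative entries only, not clinical knowledge.
KG = [
    Disease("sciatica", "lumbar pain", frozenset({"leg pain", "numbness", "back pain"})),
    Disease("lumbar strain", "lumbar pain", frozenset({"back pain", "stiffness"})),
    Disease("migraine", "headache", frozenset({"headache", "nausea", "light sensitivity"})),
]

# Toy EHR store (hypothetical): pairs of (symptoms, recorded diagnosis).
EHRS = [
    ({"back pain", "leg pain"}, "sciatica"),
    ({"headache", "nausea"}, "migraine"),
    ({"back pain", "stiffness"}, "lumbar strain"),
]


def jaccard(a, b):
    """Set overlap in [0, 1]; a stand-in for whatever similarity the real system uses."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def retrieve_ehrs(symptoms, k=2):
    """Return the k records whose symptoms overlap most with the patient's."""
    ranked = sorted(EHRS, key=lambda rec: jaccard(symptoms, rec[0]), reverse=True)
    return ranked[:k]


def candidate_diseases(symptoms):
    """Pick the best-matching category, then list what tells its diseases apart."""
    scores = {}
    for d in KG:
        scores[d.category] = max(scores.get(d.category, 0.0), jaccard(symptoms, d.features))
    best = max(scores, key=scores.get)
    siblings = [d for d in KG if d.category == best]
    shared = frozenset.intersection(*(d.features for d in siblings))
    return best, {d.name: sorted(d.features - shared) for d in siblings}


def build_prompt(symptoms):
    """Combine graph-derived hints and retrieved records into one LLM prompt."""
    category, distinguishing = candidate_diseases(symptoms)
    lines = [
        "Patient symptoms: " + ", ".join(sorted(symptoms)),
        "Closest category in the graph: " + category,
        "Candidate diagnoses and their distinguishing features:",
    ]
    for name, feats in distinguishing.items():
        lines.append(f"- {name}: {', '.join(feats) or 'none'}")
    lines.append("Similar past records:")
    for rec_symptoms, diagnosis in retrieve_ehrs(symptoms):
        lines.append(f"- {', '.join(sorted(rec_symptoms))} -> {diagnosis}")
    return "\n".join(lines)
```

Calling print(build_prompt({"back pain", "numbness"})) shows the assembled prompt. In a real pipeline that text would be sent to the chosen LLM backbone; the sketch stops before that step so it runs without API access.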

Quick Start & Requirements

Install: Clone the repository (git clone https://github.com/SNOWTEAM2023/MedRAG.git), navigate into the directory (cd MedRAG), and install dependencies (pip install -r requirements.txt).
Prerequisites: Requires OpenAI and Hugging Face API tokens, which must be manually inserted into authentication.py.
Dataset: Uses the DDXPlus (synthesized EHR) and CPDD (private chronic pain) datasets. Preprocessing requires editing KG_Retrieve.py and using the file "AI Data Set with Categories.csv".
Run: Execute the main script: python main.py.
Links: Paper on arXiv, Demo on YouTube/Bilibili.
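The README's approach is to paste both tokens into authentication.py by hand. As a hedged alternative, the sketch below reads them from environment variables and fails early when one is missing. The variable names OPENAI_API_KEY and HF_TOKEN and the function load_tokens are assumptions for illustration; they are not taken from the repository, and authentication.py may expect different names.

```python
import os


def load_tokens(env=os.environ):
    """Return both API tokens, raising early if either one is missing."""
    # Hypothetical variable names; adjust to whatever authentication.py expects.
    names = ("OPENAI_API_KEY", "HF_TOKEN")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError("missing tokens: " + ", ".join(missing))
    return {n: env[n] for n in names}
```

Keeping secrets out of a tracked file reduces the chance of committing them by accident, at the cost of one extra setup step before running python main.py.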

Highlighted Details

The paper was accepted at The Web Conference (WWW) 2025, and the accompanying demo paper was accepted at IJCAI 2025.
Demonstrated superior performance on DDXPlus and CPDD datasets, particularly on the L3 diagnostic metric.
KG-elicited reasoning significantly boosts diagnostic accuracy across various LLM backbones.
Introduced a novel voice modality in the IJCAI'25 demo.

Maintenance & Community

The project has garnered media attention from outlets like Medium and AI Era. No explicit community channels (e.g., Discord, Slack) or roadmap links are provided in the README.

Licensing & Compatibility

No specific open-source license is mentioned in the provided README text, which is a critical omission for assessing commercial compatibility or usage restrictions.

Limitations & Caveats

Functionality depends on external API keys (OpenAI, Hugging Face). Dataset preparation requires direct modification of Python scripts. The absence of a stated license is a significant adoption blocker.

Health Check

Last Commit

5 months ago

Responsiveness

Inactive

Star History

6 stars in the last 30 days