XrayGPT by mbzuai-oryx

Research paper for chest radiograph summarization using medical vision-language models

created 2 years ago
515 stars

Top 61.6% on sourcepulse

Project Summary

XrayGPT addresses the challenge of automated chest radiograph summarization by leveraging medical vision-language models. It is designed for researchers and developers in medical AI, aiming to improve diagnostic efficiency and patient communication through accurate, LLM-generated radiology reports.

How It Works

XrayGPT aligns a frozen medical visual encoder (MedCLIP) with a fine-tuned LLM (Vicuna) using a simple linear transformation. This design reuses pre-trained, domain-specific models, and the LLM is fine-tuned on a curated set of medical conversations to improve its performance on radiology reports. The model is trained in two stages: pre-training on MIMIC-CXR, then fine-tuning on the OpenI dataset.
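The alignment step described above can be pictured with a small, dependency-free sketch. It is not the project's code: the function names (make_linear, project), the toy dimensions, and the random stand-in features are all assumptions made for illustration. It only shows the general idea of mapping vectors from a frozen visual encoder into an LLM's embedding size with a single linear layer.

```python
import random

def make_linear(d_in, d_out, seed=0):
    """Create a weight matrix and bias for one linear layer.

    In this sketch these are the only parameters that would be trained;
    the visual encoder that produces the inputs is treated as frozen.
    """
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(d_out)] for _ in range(d_in)]
    bias = [0.0] * d_out
    return weight, bias

def project(features, weight, bias):
    """Map each visual feature vector (length d_in) to length d_out."""
    d_out = len(bias)
    return [
        [sum(x * weight[i][j] for i, x in enumerate(vec)) + bias[j] for j in range(d_out)]
        for vec in features
    ]

# Toy sizes chosen only for illustration (hypothetical, not the real model's dimensions).
NUM_VISUAL_TOKENS, D_VISUAL, D_LLM = 4, 8, 16

# Stand-in for the frozen encoder's output: one random vector per visual token.
rng = random.Random(1)
visual_features = [
    [rng.gauss(0, 1) for _ in range(D_VISUAL)] for _ in range(NUM_VISUAL_TOKENS)
]

weight, bias = make_linear(D_VISUAL, D_LLM)
image_tokens = project(visual_features, weight, bias)

# Each projected vector now has the (toy) LLM embedding size.
print(len(image_tokens), len(image_tokens[0]))
```

Because only the projection's weight and bias would be updated in such a setup, the number of trainable parameters stays small compared with the encoder and the LLM.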

Quick Start & Requirements

  • Installation: Clone the repository, then create a Conda environment in one of two ways: conda env create -f env.yml, or conda create -n xraygpt python=3.9 && conda activate xraygpt && pip install -r xraygpt_requirements.txt.
  • Prerequisites: Requires Vicuna-7B v1 weights, MiniGPT-4 checkpoints, and preprocessed MIMIC-CXR and OpenI datasets. Training utilizes AMD MI250X GPUs.
  • Resources: Setup involves downloading large datasets and model weights. Training requires multiple GPUs.
  • Links: Online Demo, Technical Report

Highlighted Details

  • Fine-tuned Vicuna LLM on 100k patient-doctor and 30k radiology conversations.
  • Generated ~217k summaries from MIMIC-CXR and OpenI datasets.
  • Built upon MiniGPT-4, MedCLIP, BLIP-2, and Lavis frameworks.
  • Accepted at BioNLP-ACL 2024.

Maintenance & Community

The project is associated with Mohamed bin Zayed University of Artificial Intelligence. Further community or roadmap information is not explicitly detailed in the README.

Licensing & Compatibility

  • License: CC BY-NC-SA.
  • Restrictions: Non-commercial use and share-alike clauses apply.

Limitations & Caveats

The CC BY-NC-SA license restricts commercial use. The project relies on specific pre-trained model weights (Vicuna-7B v1, MiniGPT-4), which must be downloaded and configured separately. Training is demonstrated on AMD MI250X GPUs, so setups on other hardware may need adjustment.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 90 days
