XrayGPT: chest radiograph summarization using medical vision-language models
XrayGPT addresses the challenge of automated chest radiograph summarization by leveraging medical vision-language models. It is designed for researchers and developers in medical AI, aiming to improve diagnostic efficiency and patient communication through accurate, LLM-generated radiology reports.
How It Works
XrayGPT aligns a frozen medical visual encoder (MedCLIP) with a fine-tuned LLM (Vicuna) through a single learnable linear projection layer. This design leverages pre-trained, domain-specific components and a curated set of medical conversations to improve the LLM's performance on radiology reports. Training proceeds in two stages: pre-training on MIMIC-CXR, followed by fine-tuning on the OpenI dataset.
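The alignment step above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the project's actual code: the class name, feature dimensions (768 for the visual encoder, 4096 for a 7B LLM), and patch counts are hypothetical stand-ins, and only the projection layer is trainable while the encoder and LLM stay frozen.

```python
import torch
import torch.nn as nn

class VisionToLLMProjector(nn.Module):
    """Single linear layer mapping frozen visual-encoder features
    into the LLM's token-embedding space (dimensions are illustrative)."""
    def __init__(self, vis_dim=768, llm_dim=4096):
        super().__init__()
        self.proj = nn.Linear(vis_dim, llm_dim)

    def forward(self, vis_feats):
        # vis_feats: (batch, num_patches, vis_dim) from the frozen encoder
        return self.proj(vis_feats)

# Toy usage: projected image tokens are prepended to the text-prompt
# embeddings before the (frozen) LLM forward pass.
batch, patches = 2, 49
vis_feats = torch.randn(batch, patches, 768)   # stand-in for encoder output
projector = VisionToLLMProjector()
img_tokens = projector(vis_feats)              # (2, 49, 4096)
text_tokens = torch.randn(batch, 10, 4096)     # stand-in prompt embeddings
llm_input = torch.cat([img_tokens, text_tokens], dim=1)
print(llm_input.shape)
```

Because only the projection is optimized, the two training stages amount to fitting this small layer (and, here, the fine-tuned LLM) on paired image-report data rather than training the full stack end to end.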
Quick Start & Requirements
Create the environment from the provided file:

    conda env create -f env.yml

Or build it manually:

    conda create -n xraygpt python=3.9
    conda activate xraygpt
    pip install -r xraygpt_requirements.txt
Highlighted Details
Maintenance & Community
The project is associated with Mohamed bin Zayed University of Artificial Intelligence. Further community or roadmap information is not explicitly detailed in the README.
Licensing & Compatibility
Limitations & Caveats
The CC BY-NC-SA license restricts commercial use. The project relies on specific pre-trained model weights (Vicuna-7B v1, MiniGPT-4) which require separate downloads and configuration. Training is demonstrated on AMD MI250X GPUs, suggesting potential hardware-specific considerations.