Chinese medical multimodal model for chest X-ray summarization
XrayGLM is a Chinese multimodal large language model designed for chest radiograph summarization and interactive dialogue. It addresses the gap in medical multimodal research for the Chinese language, offering potential for medical image diagnosis and conversational AI in healthcare.
How It Works
XrayGLM is built by fine-tuning the VisualGLM-6B model on a custom Chinese chest X-ray dataset. This dataset was created by translating English reports from MIMIC-CXR and OpenI datasets using ChatGPT, aiming to improve accessibility and research for the Chinese medical community. The approach leverages existing powerful multimodal architectures and augments them with domain-specific, multilingual data.
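The dataset-construction step described above pairs each chest X-ray with a Chinese report translated from its English original. A minimal sketch of packing one such pair into a fine-tuning record is shown below; the field names and prompt text are illustrative assumptions, not the project's actual schema:

```python
import json

def build_record(image_path, zh_report, prompt="详细描述这张胸部X光片"):
    """Pack one chest X-ray and its translated Chinese report into a
    VisualGLM-style instruction-tuning record.
    NOTE: field names ("img", "prompt", "label") are hypothetical."""
    return {"img": image_path, "prompt": prompt, "label": zh_report}

# Example: an English MIMIC-CXR finding already translated to Chinese
# ("Lungs are clear; no consolidation or pleural effusion.")
record = build_record(
    "mimic/p10/s50414267.jpg",  # illustrative path
    "双肺纹理清晰，未见明显实变或胸腔积液。",
)
print(json.dumps(record, ensure_ascii=False))
```

Each record couples an image path with an instruction prompt and the target report, which is the general shape expected by instruction-tuned multimodal models.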
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (or requirements_wo_ds.txt to skip DeepSpeed). SwissArmyTransformer>=0.3.6 is required. Run cli_demo.py or web_demo.py with the specified model checkpoints.
Highlighted Details
Maintenance & Community
The project is a student-led initiative from Macao Polytechnic University; key contributors and advisors are listed in the repository. OpenAI API credits for the data translation were provided by Yongle Luo, a PhD student at USTC.
Licensing & Compatibility
Limitations & Caveats
The provided checkpoints are labeled as "low quality." The model's output is not guaranteed for accuracy and should not be used for actual medical diagnosis. The project is intended strictly for academic research.