Medical imaging foundation model research paper
This project provides RadFM, a generalist foundation model for radiology, capable of processing both 2D and 3D medical scans with interleaved visual and language inputs. It is designed for researchers and practitioners in medical AI who need a versatile model for various radiological tasks.
How It Works
RadFM leverages a multi-modal generative approach, integrating visual tokens from a 3D Vision Transformer and Perceiver directly into a LLaMA-based causal language model. This allows for seamless fusion of image and text data, enabling tasks like image captioning and diagnosis. A custom My_trainer and datasampler.py are used to ensure that 2D and 3D data are not mixed within the same batch, which avoids the computational overhead of data expansion and improves training efficiency.
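The batching rule described above can be illustrated with a small, framework-free sketch. This is not the project's datasampler.py; the function name dimension_grouped_batches and the "2D"/"3D" labels are hypothetical, chosen only to show one way of keeping each batch to a single dimensionality so that no sample has to be expanded to match the shape of another.

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence


def dimension_grouped_batches(
    sample_dims: Sequence[str],
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[int]]:
    """Yield batches of dataset indices; each batch holds one dimensionality only.

    `sample_dims[i]` is a hypothetical label ("2D" or "3D") for sample i.
    """
    rng = random.Random(seed)

    # Bucket the indices by their dimensionality label.
    buckets: Dict[str, List[int]] = defaultdict(list)
    for index, dim in enumerate(sample_dims):
        buckets[dim].append(index)

    # Shuffle inside each bucket, then cut each bucket into batches.
    batches: List[List[int]] = []
    for indices in buckets.values():
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batches.append(indices[start:start + batch_size])

    # Shuffle the batch order so 2D and 3D batches alternate randomly.
    rng.shuffle(batches)
    yield from batches


if __name__ == "__main__":
    dims = ["2D", "3D", "2D", "2D", "3D", "3D", "2D"]
    for batch in dimension_grouped_batches(dims, batch_size=2):
        # Every printed set contains exactly one label.
        print(batch, {dims[i] for i in batch})
```

In a PyTorch setup, batches produced this way could be handed to a DataLoader through its batch_sampler argument; whether the project wires its sampler in this manner is an assumption.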
Quick Start & Requirements
Place the pytorch_model.bin checkpoint in Quick_demo/, then run python test.py.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The key_words_query functionality mentioned in the embedding layer is currently unused.