RadFM by chaoyi-wu

Medical imaging foundation model research paper

Created 2 years ago

524 stars

Top 60.1% on SourcePulse

Project Summary

This project provides RadFM, a generalist foundation model for radiology, capable of processing both 2D and 3D medical scans with interleaved visual and language inputs. It is designed for researchers and practitioners in medical AI who need a versatile model for various radiological tasks.

How It Works

RadFM takes a multimodal generative approach: a 3D Vision Transformer followed by a Perceiver turns each scan into visual tokens, which are fed directly into a LLaMA-based causal language model alongside the text. Because images and text share one input sequence, the model can handle tasks such as image captioning and diagnosis. A custom trainer (My_trainer) and sampler (datasampler.py) keep 2D and 3D data in separate batches; mixing them would require expanding the 2D data, which adds computational overhead and slows training.
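
The batching rule described above can be sketched in a few lines. This is a minimal illustration, not the repository's datasampler.py: the function name homogeneous_batches and its parameters are hypothetical, and each sample is represented only by its dimensionality (2 or 3).

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence


def homogeneous_batches(
    dims: Sequence[int], batch_size: int, seed: int = 0
) -> Iterator[List[int]]:
    """Yield batches of sample indices that never mix dimensionalities.

    dims[i] is the dimensionality (2 or 3) of sample i. This is an
    illustrative stand-in, not RadFM's actual sampler.
    """
    rng = random.Random(seed)

    # Group sample indices by dimensionality.
    groups: Dict[int, List[int]] = defaultdict(list)
    for idx, dim in enumerate(dims):
        groups[dim].append(idx)

    # Cut each group into batches, so every batch is all-2D or all-3D.
    batches: List[List[int]] = []
    for indices in groups.values():
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batches.append(indices[start:start + batch_size])

    # Shuffle batch order so 2D and 3D batches alternate randomly.
    rng.shuffle(batches)
    yield from batches
```

Because every batch holds only one kind of sample, 2D samples never have to be expanded to match the 3D volumes they would otherwise be stacked with.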

Quick Start & Requirements

Install/Run: Download the model checkpoint, decompress it, place pytorch_model.bin in Quick_demo/, then run python test.py.
Prerequisites: Requires at least one Nvidia A100 (80GB) GPU for acceptable inference performance.
Resources: Significant GPU memory is needed.
Links: Quick Demo, Datasets
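
Before launching the demo, it can help to confirm that the checkpoint landed in the expected place. The helper below is a hypothetical pre-flight check, not part of the repository; it assumes only the file name and directory given in the steps above.

```python
from pathlib import Path


def checkpoint_ready(demo_dir: str = "Quick_demo",
                     name: str = "pytorch_model.bin") -> bool:
    """Return True if the decompressed checkpoint file is in demo_dir."""
    return (Path(demo_dir) / name).is_file()


if __name__ == "__main__":
    if checkpoint_ready():
        print("Checkpoint found; the demo can be started with: python test.py")
    else:
        print("pytorch_model.bin is missing from Quick_demo/")
```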

Highlighted Details

Trained on MedMD, a large-scale dataset comprising 16 million 2D and 3D medical images.
Supports multi-image input and visual-language interleaving.
Custom trainer prevents mixing 2D/3D data in batches for efficiency.
Utilizes a Perceiver and 3D ViT for image tokenization.
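
The interleaving of visual and text tokens listed above can be pictured with a toy example. This sketch assumes that each image contributes a fixed number of visual tokens and that the text marks image positions with a placeholder; the "<image>" marker, the interleave function, and the tuple representation of visual tokens are all hypothetical and stand in for real embeddings.

```python
from typing import List, Sequence, Tuple, Union

IMAGE_PLACEHOLDER = "<image>"  # hypothetical marker, not RadFM's actual token
Token = Union[str, Tuple[str, int, int]]


def interleave(
    text_tokens: Sequence[str], num_images: int, tokens_per_image: int
) -> List[Token]:
    """Replace each placeholder with that image's block of visual tokens.

    Visual tokens are represented as ("img", image_index, token_index)
    tuples; in a real model they would be embedding vectors.
    """
    fused: List[Token] = []
    image_idx = 0
    for tok in text_tokens:
        if tok != IMAGE_PLACEHOLDER:
            fused.append(tok)
            continue
        if image_idx >= num_images:
            raise ValueError("more placeholders than images")
        fused.extend(("img", image_idx, k) for k in range(tokens_per_image))
        image_idx += 1
    if image_idx != num_images:
        raise ValueError("more images than placeholders")
    return fused


sequence = interleave(
    ["Compare", "<image>", "with", "<image>", "."],
    num_images=2,
    tokens_per_image=2,
)
```

The resulting sequence keeps the text order intact and places each image's tokens exactly where the placeholder stood, which is what lets a causal language model read several images and text passages in a single pass.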

Maintenance & Community

Contact: wtzxxxwcy02@sjtu.edu.cn
Datasets are released and available on Hugging Face.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Inference requires a high-end GPU (Nvidia A100 80GB) to avoid extremely slow performance.
The key_words_query functionality mentioned in the embedding layer is currently unused.

Health Check

Last Commit: 7 months ago
Responsiveness: 1 week
Pull Requests (30d): 0
Issues (30d): 0

Star History

3 stars in the last 30 days
