SAT by zhaoziheng

Universal 3D medical image segmentation via text prompts

Created 2 years ago
271 stars

Top 95.1% on SourcePulse

Project Summary

SAT is a knowledge-enhanced universal segmentation model that addresses large-vocabulary segmentation for medical images. It targets researchers and practitioners who need versatile segmentation across multiple modalities and anatomical structures. The primary benefit is a single, efficient model that segments hundreds of classes, replacing the need for numerous specialist models.

How It Works

SAT implements a knowledge-enhanced universal segmentation approach. It is built upon an extensive collection of 72 public 3D medical segmentation datasets, enabling segmentation of 497 classes across MR, CT, and PET modalities. Segmentation is guided by text prompts using anatomical terminology, offering a unified and efficient solution compared to training individual specialist models.

Quick Start & Requirements

  • Installation: Install dependencies via pip install -e dynamic-network-architectures-main (from the model directory). Key requirements include torch>=1.10.0, numpy==1.21.5, monai==1.1.0, transformers==4.21.3, nibabel==4.0.2, einops==0.6.1, and positional_encodings==6.0.1. Install mamba_ssm for the U-Mamba variant.
  • Prerequisites: Specific Python library versions are required. Checkpoints for SAT and the Text Encoder must be downloaded from Hugging Face. Data preparation follows a specified jsonl format.
  • GPU Memory: Inference requires significant GPU memory: ~34GB (batch size 1) to ~62GB (batch size 2) for SAT-Pro, and ~24GB (batch size 1) to ~36GB (batch size 2) for SAT-Nano.
  • Links: Checkpoints are available at Hugging Face (zzh99/SAT).
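The README states that data preparation follows a jsonl format (one JSON object per line). The exact schema is not given here, so the field names below ("image", "label", "modality", "dataset") are illustrative assumptions; consult the repository's data documentation for the real schema. A minimal sketch of writing and reading such a file:

```python
import json

# Hypothetical jsonl entries for data preparation. Field names are
# assumptions for illustration, NOT the repository's confirmed schema.
samples = [
    {
        "image": "path/to/case001_ct.nii.gz",    # input 3D volume
        "label": "path/to/case001_label.nii.gz", # segmentation ground truth
        "modality": "CT",                        # MR, CT, or PET
        "dataset": "example_dataset",
    },
]

# jsonl: one JSON object per line, no enclosing array.
with open("data.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")

# Reading it back, line by line.
with open("data.jsonl") as f:
    loaded = [json.loads(line) for line in f]
print(loaded[0]["modality"])  # → CT
```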

Highlighted Details

  • Supports segmentation of 497 classes across 3 modalities (MR, CT, PET) and 8 body regions.
  • Utilizes text prompts (anatomical terminology) for segmentation guidance.
  • Serves as a baseline method for the CVPR 2025 "Foundation Models for Text-Guided 3D Biomedical Image Segmentation" challenge.
  • Built upon a large collection of 72 public 3D medical segmentation datasets.

Maintenance & Community

The project is associated with an NPJ Digital Medicine publication and serves as a baseline for a prominent CVPR 2025 challenge, indicating recognition in the field. No specific community channels (Discord, Slack) are listed.

Licensing & Compatibility

The README does not explicitly state the software license. Compatibility for commercial use or closed-source linking is undetermined without a specified license.

Limitations & Caveats

Inference has high GPU memory requirements for both SAT-Pro (~34-62GB) and SAT-Nano (~24-36GB). Training demands substantial resources: the authors recommend 8+ A100-80G GPUs for SAT-Nano and 16+ for SAT-Pro.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 12 stars in the last 30 days
