SAT by zhaoziheng

Universal 3D medical image segmentation via text prompts

Created 2 years ago
271 stars

Top 95.1% on SourcePulse

Project Summary

SAT is a knowledge-enhanced universal segmentation model that addresses large-vocabulary segmentation for medical images. It targets researchers and practitioners who need versatile segmentation across multiple modalities and anatomical structures. The primary benefit is a single, efficient model that segments hundreds of classes, replacing the need for numerous specialist models.

How It Works

SAT implements a knowledge-enhanced universal segmentation approach. It is built upon an extensive collection of 72 public 3D medical segmentation datasets, enabling segmentation of 497 classes across MR, CT, and PET modalities. Segmentation is guided by text prompts using anatomical terminology, offering a unified and efficient solution compared to training individual specialist models.

Quick Start & Requirements

  • Installation: Install dependencies via pip install -e dynamic-network-architectures-main (from the model directory). Key requirements include torch>=1.10.0, numpy==1.21.5, monai==1.1.0, transformers==4.21.3, nibabel==4.0.2, einops==0.6.1, and positional_encodings==6.0.1. Install mamba_ssm for the U-Mamba variant.
  • Prerequisites: Specific Python library versions are required. Checkpoints for SAT and the Text Encoder must be downloaded from Hugging Face. Data preparation follows a specified jsonl format.
  • GPU Memory: Inference requires significant GPU memory: ~34GB (batch size 1) to ~62GB (batch size 2) for SAT-Pro, and ~24GB (batch size 1) to ~36GB (batch size 2) for SAT-Nano.
  • Links: Checkpoints are available at Hugging Face (zzh99/SAT).
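The README states that data preparation follows a jsonl format (one JSON object per line). The exact schema is not given here, so the field names below ("image", "label", "modality", "dataset") are illustrative assumptions; consult the repository's data documentation for the real schema. A minimal sketch of writing and reading such a file:

```python
import json

# Hypothetical jsonl entries for data preparation. Field names are
# assumptions for illustration, NOT the repository's confirmed schema.
samples = [
    {
        "image": "path/to/case001_ct.nii.gz",    # input 3D volume
        "label": "path/to/case001_label.nii.gz", # segmentation ground truth
        "modality": "CT",                        # MR, CT, or PET
        "dataset": "example_dataset",
    },
]

# jsonl: one JSON object per line, no enclosing array.
with open("data.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")

# Reading it back, line by line.
with open("data.jsonl") as f:
    loaded = [json.loads(line) for line in f]
print(loaded[0]["modality"])  # → CT
```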

Highlighted Details

  • Supports segmentation of 497 classes across 3 modalities (MR, CT, PET) and 8 body regions.
  • Utilizes text prompts (anatomical terminology) for segmentation guidance.
  • Serves as a baseline method for the CVPR 2025 "Foundation Models for Text-Guided 3D Biomedical Image Segmentation" challenge.
  • Built upon a large collection of 72 public 3D medical segmentation datasets.

Maintenance & Community

The project is associated with an NPJ Digital Medicine publication and serves as a baseline for a prominent CVPR 2025 challenge, indicating recognition in the field. No specific community channels (Discord, Slack) are listed.

Licensing & Compatibility

The README does not explicitly state the software license. Compatibility for commercial use or closed-source linking is undetermined without a specified license.

Limitations & Caveats

Inference has high GPU memory requirements for both SAT-Pro (~34-62GB) and SAT-Nano (~24-36GB). Training demands substantial resources: the authors recommend 8+ A100-80G GPUs for SAT-Nano and 16+ for SAT-Pro.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 12 stars in the last 30 days
