IEEE_TPAMI_SpectralGPT by danfenghong

Foundation model for spectral remote sensing

Created 2 years ago

266 stars

Top 96.4% on SourcePulse

Project Summary

SpectralGPT: Spectral Remote Sensing Foundation Model

SpectralGPT fills a gap in foundation models: until now, none was built specifically for spectral remote sensing (RS) data. It targets researchers and practitioners in geoscience and remote sensing, who can use the pretrained model as a starting point for downstream RS applications. Its main benefit is that pretraining on large-scale spectral RS data improves performance on downstream tasks such as scene classification, semantic segmentation, and change detection.

How It Works

This project introduces a 3D generative pretrained transformer (GPT) architecture designed specifically for spectral RS images. SpectralGPT is trained progressively so that it can accommodate inputs of varying sizes, resolutions, and time series, which lets it make full use of large RS archives. Its core approach combines 3D token generation, which couples spatial and spectral information in each token, with multi-target reconstruction, which captures spectrally sequential patterns. Together, these let the model extract features along both the spatial and the spectral dimensions of the data.
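To make the idea of 3D tokens concrete, the sketch below counts the tokens produced when an image cube is cut into non-overlapping patches along the spectral and both spatial axes. It is only an illustration: the function names, the 12-band 96x96 input, the 3x8x8 patch size, and the 0.9 mask ratio are assumptions chosen for the example, not values stated in this summary. The random masking step is likewise an assumption, borrowed from the masked-autoencoder style of pretraining that the repository cites as inspiration.

```python
import random


def count_3d_tokens(bands, height, width, band_patch, spatial_patch):
    """Return the (spectral, rows, cols) token grid for non-overlapping 3D patches."""
    if bands % band_patch or height % spatial_patch or width % spatial_patch:
        raise ValueError("image dimensions must be divisible by the patch dimensions")
    return (bands // band_patch, height // spatial_patch, width // spatial_patch)


def split_masked_visible(num_tokens, mask_ratio, seed=0):
    """Randomly pick token indices to hide; the rest stay visible to the encoder."""
    num_masked = int(num_tokens * mask_ratio)
    rng = random.Random(seed)
    masked = sorted(rng.sample(range(num_tokens), num_masked))
    masked_set = set(masked)
    visible = [i for i in range(num_tokens) if i not in masked_set]
    return masked, visible


# Illustrative (assumed) input: 12 spectral bands, 96x96 pixels, 3x8x8 patches.
grid = count_3d_tokens(bands=12, height=96, width=96, band_patch=3, spatial_patch=8)
num_tokens = grid[0] * grid[1] * grid[2]
masked, visible = split_masked_visible(num_tokens, mask_ratio=0.9)
print(grid, num_tokens, len(masked), len(visible))  # (4, 12, 12) 576 518 58
```

Because each token spans several adjacent bands as well as a spatial window, a reconstruction target for a hidden token involves both spatial and spectral context, which is the coupling the paragraph above describes.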

Quick Start & Requirements

Installation: Install Python dependencies using pip install -r requirements.txt.
Prerequisites:
- Python environment.
- Significant GPU resources: Pretraining experiments utilized 8 NVIDIA GeForce RTX 4090 GPUs.
- Datasets: fMoW-Sentinel, BigEarthNet, EuroSAT, OSCD, and SegMunich are required for pretraining and finetuning. Download links are provided in the README.
Commands:
- Pretraining (fMoW-Sentinel): torchrun --nproc_per_node=8 main_pretrain.py ...
- Pretraining (BigEarthNet): torchrun --nproc_per_node=8 main_pretrain.py ...
- Finetuning (EuroSAT): torchrun --nproc_per_node=2 main_finetune.py ...
- Finetuning (OSCD): python train.py
- Finetuning (SegMunich): python -m torch.distributed.launch ... train_multi_GPU_new.py
Links:
- Paper: IEEE Xplore
- Model Checkpoints: Available for download.
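Since the commands above depend on locally downloaded datasets, a small pre-flight check can save a failed launch. The following is a minimal sketch using only the Python standard library; the folder names and the ./data root are hypothetical placeholders, and the actual directory layout is whatever the repository README specifies.

```python
from pathlib import Path

# Hypothetical folder names; the repository README defines the real layout.
EXPECTED_DATASETS = ["fMoW-Sentinel", "BigEarthNet", "EuroSAT", "OSCD", "SegMunich"]


def find_missing_datasets(data_root, names=EXPECTED_DATASETS):
    """Return the dataset folder names that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = find_missing_datasets("./data")  # "./data" is a placeholder path
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders are present.")
```

Running the check before any of the torchrun or python commands reports which dataset folders still need to be downloaded or relocated.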

Highlighted Details

- Presents the first universal foundation model tailored for spectral remote sensing data.
- Trained on a dataset of one million spectral RS images.
- Models have over 600 million parameters.
- Demonstrates significant performance improvements on four downstream tasks: single-label scene classification, multi-label scene classification, semantic segmentation, and change detection.

Maintenance & Community

The repository accompanies a 2024 publication in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). The code draws on established repositories, including Masked Autoencoders (MAE), SatMAE, Seasonal Contrast (SeCo), and Fully Convolutional Siamese Networks. The README provides no community channels (e.g., Discord, Slack) and no roadmap link.

Licensing & Compatibility

The project is licensed under the GNU General Public License v3.0 (GPL v3). This is a strong copyleft license, meaning derivative works must also be distributed under the GPL v3. This may impose restrictions on integration into closed-source commercial products.

Limitations & Caveats

Pretraining demands substantial computational resources, specifically multiple high-end GPUs (e.g., 8x RTX 4090). The GPL v3 license may present compatibility challenges for commercial or proprietary software integrations. Commands for downstream tasks such as OSCD and SegMunich are provided, but users must set data paths and checkpoint configurations themselves.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days