danfenghong
Foundation model for spectral remote sensing
SpectralGPT: Spectral Remote Sensing Foundation Model
SpectralGPT addresses the lack of foundation models for spectral remote sensing (RS) data. It is aimed at researchers and practitioners in geoscience and remote sensing, and provides a pretrained model that can be fine-tuned for downstream RS applications. Its main benefit is that pretraining on large-scale spectral RS data improves performance across a range of scene-understanding tasks.
How It Works
This project introduces a novel 3D generative pretrained transformer (GPT) architecture specifically for spectral RS images. SpectralGPT employs progressive training to accommodate diverse input characteristics like varying sizes, resolutions, and time series, enabling full utilization of extensive RS big data. Its core approach involves 3D token generation for robust spatial-spectral coupling and multi-target reconstruction to capture spectrally sequential patterns. This design allows for more effective feature extraction from complex spectral data.
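The 3D token generation described above can be illustrated with a minimal NumPy sketch. It is not the repository's code: the function names `tokenize_3d` and `random_mask`, the 12-band 96x96 input, the 3x8x8 token size, and the 0.9 mask ratio are all assumptions chosen for illustration, and the random masking step is borrowed from the Masked Autoencoders lineage rather than confirmed by this summary. Each token spans a small group of adjacent bands as well as a spatial patch, which is what couples spatial and spectral information in a single token.

```python
import numpy as np


def tokenize_3d(cube, patch=8, band_group=3):
    """Split a (bands, height, width) cube into flattened 3D tokens.

    Each token covers `band_group` adjacent bands and a `patch` x `patch`
    spatial window. Both sizes are illustrative assumptions.
    """
    c, h, w = cube.shape
    if c % band_group or h % patch or w % patch:
        raise ValueError("cube dimensions must be divisible by the token size")
    gc, gh, gw = c // band_group, h // patch, w // patch
    # Split every axis into (grid index, offset inside the token).
    blocks = cube.reshape(gc, band_group, gh, patch, gw, patch)
    # Bring the three grid axes to the front, token contents to the back.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(gc * gh * gw, band_group * patch * patch)


def random_mask(num_tokens, mask_ratio, rng):
    """Return a boolean array that is True for tokens hidden from the encoder."""
    num_masked = int(round(num_tokens * mask_ratio))
    mask = np.zeros(num_tokens, dtype=bool)
    mask[rng.permutation(num_tokens)[:num_masked]] = True
    return mask


rng = np.random.default_rng(0)
cube = rng.random((12, 96, 96), dtype=np.float32)  # hypothetical 12-band image
tokens = tokenize_3d(cube)
mask = random_mask(len(tokens), 0.9, rng)
visible = tokens[~mask]  # tokens an encoder would see; the rest are reconstructed
print(tokens.shape, visible.shape)
```

In a masked-reconstruction setup of this kind, only `visible` would be passed to the encoder, and a decoder would be trained to reconstruct the tokens where `mask` is True.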
Quick Start & Requirements
pip install -r requirements.txt
torchrun --nproc_per_node=8 main_pretrain.py ...
torchrun --nproc_per_node=8 main_pretrain.py ...
torchrun --nproc_per_node=2 main_finetune.py ...
python train.py
python -m torch.distributed.launch ... train_multi_GPU_new.py
Highlighted Details
Maintenance & Community
The repository is associated with the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2024 publication. Code inspiration is drawn from established repositories like Masked Autoencoders (MAE), SatMAE, Seasonal Contrast (SeCo), and Fully Convolutional Siamese Networks. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.
Licensing & Compatibility
The project is licensed under the GNU General Public License v3.0 (GPL v3). This is a strong copyleft license, meaning derivative works must also be distributed under the GPL v3. This may impose restrictions on integration into closed-source commercial products.
Limitations & Caveats
The pretraining process demands substantial computational resources, specifically multiple high-end GPUs (e.g., 8x RTX 4090). The GPL v3 license may present compatibility challenges for certain commercial or proprietary software integrations. Detailed commands for downstream tasks like OSCD and SegMunich are provided, but users will need to manage data paths and checkpoint configurations.