RSPrompter by KyanChen

PyTorch code for remote sensing instance segmentation via visual foundation models

Created 2 years ago

643 stars

Top 51.8% on SourcePulse

Project Summary

This repository provides a PyTorch implementation for RSPrompter, a method for remote sensing instance segmentation using visual foundation models. It targets researchers and practitioners in remote sensing and computer vision, offering a framework to leverage large foundation models for improved segmentation accuracy.

How It Works

RSPrompter builds on the MMDetection framework and integrates the Segment Anything Model (SAM) for instance segmentation. It introduces prompting techniques that adapt SAM to remote sensing data. To keep memory usage manageable, it supports parameter-efficient fine-tuning with LoRA and variable input image sizes.
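To give a rough sense of why LoRA helps with memory, the sketch below counts trainable parameters for a single weight matrix under the standard LoRA formulation, where a frozen d_out x d_in weight is augmented by two low-rank factors B (d_out x r) and A (r x d_in). The layer width and rank used here are hypothetical values chosen for illustration; they are not taken from the RSPrompter configs.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in matrix (bias ignored)."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d = 1280   # hypothetical hidden width, for illustration only
    rank = 16  # hypothetical LoRA rank, for illustration only
    full = full_params(d, d)
    lora = lora_trainable_params(d, d, rank)
    print(f"full fine-tuning: {full:,} trainable weights")
    print(f"LoRA (rank {rank}): {lora:,} trainable weights ({lora / full:.2%} of full)")
```

Fewer trainable weights means fewer gradients and optimizer states to hold in GPU memory. The frozen base weights and the activations still have to fit, so the actual saving depends on the model and the configuration.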

Quick Start & Requirements

Installation: Clone the repository and install dependencies via pip and mim. Recommended environment: Python 3.10, PyTorch 2.1.x, CUDA 12.1, MMCV 2.1.x.
Prerequisites: Linux or Windows, Miniconda, PyTorch, MMCV, transformers, wandb, and other listed Python packages. DeepSpeed is optional for accelerated training.
Setup: Follow the detailed installation steps in the README.
Resources: Official documentation and Hugging Face Spaces for models are linked.
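Before following the README's installation steps, it can be useful to check what is already present in the active environment. The following is a minimal sketch using only the Python standard library. The distribution names (torch, mmcv, transformers, wandb, deepspeed) are assumed to match the names the packages are installed under, and the helper names are hypothetical. The README remains the authoritative source for the required packages and versions.

```python
import sys
from importlib import metadata

# Versions recommended in the summary above (PyTorch 2.1.x, MMCV 2.1.x).
# The distribution names are assumptions about how the packages are installed.
RECOMMENDED = {"torch": "2.1", "mmcv": "2.1"}
# Other packages named in the prerequisites; deepspeed is optional.
OTHER = ["transformers", "wandb", "deepspeed"]


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches(version, prefix):
    """True if version is exactly prefix or starts with prefix followed by a dot."""
    return version is not None and (version == prefix or version.startswith(prefix + "."))


def check_environment():
    """Collect the Python version and the installed version of each package."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in list(RECOMMENDED) + OTHER:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    report = check_environment()
    print(f"python {report['python']} (recommended: 3.10)")
    for name, prefix in RECOMMENDED.items():
        status = "ok" if matches(report[name], prefix) else f"recommended {prefix}.x"
        print(f"{name}: {report[name]} ({status})")
    for name in OTHER:
        print(f"{name}: {report[name]}")
```

A value of None in the report means the package was not found. For deepspeed this is expected unless accelerated training is wanted.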

Highlighted Details

Consistent API and usage with MMDetection.
Open-source SAM-seg, SAM-det, and RSPrompter models.
Supports AMP and DeepSpeed for training.
Variable input image size and LoRA for memory efficiency.

Maintenance & Community

The project is actively developed, with recent updates in late 2023. Users can seek help via GitHub Issues.

Licensing & Compatibility

Licensed under Apache 2.0, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

DeepSpeed support is noted as imperfect on Windows. The README suggests that lower-resolution inputs reduce memory usage, but their effect on segmentation performance has not been verified. Some configurations require significant GPU memory (e.g., 20.9 GB for RSPrompter-query with 1024x1024 input on a single RTX 4090).
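As a rough illustration of why input size matters for memory, the sketch below counts the patch tokens a ViT-style image encoder produces at different resolutions. It assumes a patch size of 16, as in the original SAM image encoder. Whether every RSPrompter configuration uses that value is an assumption, and the function name is hypothetical.

```python
PATCH_SIZE = 16  # assumed: patch size of the original SAM ViT image encoder


def num_patch_tokens(height: int, width: int, patch: int = PATCH_SIZE) -> int:
    """Number of non-overlapping patch tokens for an image of the given size."""
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return (height // patch) * (width // patch)


if __name__ == "__main__":
    for size in (1024, 512):
        tokens = num_patch_tokens(size, size)
        print(f"{size}x{size} input: {tokens} patch tokens")
```

Halving each side cuts the token count by a factor of four, which shrinks the encoder's activation memory accordingly. This does not predict total GPU usage: the 20.9 GB figure above also covers model weights, gradients, and optimizer state, and the effect of lower resolution on accuracy is unverified.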
