segformer-pytorch by bubbliiiing

PyTorch code for SegFormer semantic segmentation

Created 3 years ago

463 stars

Top 65.4% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of the SegFormer semantic segmentation model that users can train on their own datasets. It targets computer vision researchers and practitioners who need a configurable segmentation codebase. The project provides pre-trained weights along with instructions for training and inference.

How It Works

The implementation follows the SegFormer architecture, which pairs a hierarchical Transformer encoder with a lightweight MLP decoder. The encoder produces multi-scale features and uses no positional encodings, so it avoids the accuracy loss that interpolating positional codes causes when the test resolution differs from the training resolution. The decoder fuses the multi-scale features with simple MLP layers, a design the SegFormer paper reports as more efficient than earlier segmentation models. The project supports six backbone sizes (b0 to b5) and several training configurations.
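The data flow described above can be traced with plain shape arithmetic. The following is a minimal sketch, not the repository's code: it assumes the stage strides and MiT-B0 channel widths given in the SegFormer paper, and the function names (encoder_shapes, decoder_shapes) are hypothetical.

```python
# Shape walk-through of a SegFormer-style encoder and MLP decoder.
# No tensors are involved; only (channels, height, width) tuples are traced.
# Assumption: strides and MiT-B0 channel widths as given in the SegFormer paper.
# All names here are hypothetical and not taken from the repository.

STRIDES = (4, 8, 16, 32)          # downsampling factor of each encoder stage
B0_CHANNELS = (32, 64, 160, 256)  # per-stage channel widths for MiT-B0
DECODER_DIM = 256                 # shared embedding width inside the decoder


def encoder_shapes(height, width, channels=B0_CHANNELS, strides=STRIDES):
    """Return (channels, h, w) for each of the four hierarchical stages."""
    return [(c, height // s, width // s) for c, s in zip(channels, strides)]


def decoder_shapes(stage_shapes, num_classes, embed_dim=DECODER_DIM):
    """Trace the decoder: project, upsample to 1/4 scale, concatenate, classify."""
    _, target_h, target_w = stage_shapes[0]  # the 1/4-resolution stage
    # Every stage is projected to embed_dim channels and upsampled to 1/4 scale.
    projected = [(embed_dim, target_h, target_w) for _ in stage_shapes]
    # Channel-wise concatenation of the four projected maps.
    fused = (embed_dim * len(projected), target_h, target_w)
    # A final fusion and classification step yields one channel per class.
    logits = (num_classes, target_h, target_w)
    return fused, logits


stages = encoder_shapes(512, 512)
fused, logits = decoder_shapes(stages, num_classes=21)  # 21 = VOC classes incl. background
print(stages)
print(fused, logits)
```

The logits come out at 1/4 of the input resolution, so a full-size prediction needs one more upsampling step to the original image size.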

Quick Start & Requirements

Requires PyTorch 1.2.0; install it with pip install torch==1.2.0.
Pre-trained weights and datasets are available via Baidu NetDisk links provided in the README.
Training and inference scripts (train.py, predict.py, get_miou.py) are included.
Official SegFormer paper: https://arxiv.org/abs/2105.15203
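The script name get_miou.py points to evaluation by mean intersection over union (mIoU). As a rough illustration of that metric, and not the repository's implementation, the sketch below builds a confusion matrix from flattened label lists and averages the per-class IoU. The function names and the ignore_index value of 255 (a common convention for unlabeled VOC pixels) are assumptions.

```python
# Minimal mIoU sketch over flattened label sequences.
# Assumptions: labels are integers in [0, num_classes), and pixels labeled
# ignore_index (255 here, a common VOC convention) are skipped.
# Function names are hypothetical.


def confusion_matrix(gt, pred, num_classes, ignore_index=255):
    """matrix[g][p] counts pixels with ground truth g that were predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_index:
            continue
        matrix[g][p] += 1
    return matrix


def mean_iou(matrix):
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # class c pixels predicted as another class
        fp = sum(row[c] for row in matrix) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


gt = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(confusion_matrix(gt, pred, num_classes=3))
print(round(miou, 4))
```

In this toy case the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.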

Highlighted Details

Supports training with evaluation, multiple backbones, step and cosine learning rate decay, Adam/SGD optimizers, and adaptive learning rates based on batch size.
Reports mIoU from 73.34 (b0) to 80.38 (b2) on the VOC12+SBD dataset with 512x512 input.
Includes functionality for FPS testing, batch inference, and video detection.
Supports custom dataset training with VOC format.
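The learning rate options listed above (step decay, cosine decay, and a learning rate adapted to batch size) can be sketched as follows. This is an illustration under stated assumptions rather than the repository's code: the batch-size adjustment uses the common linear scaling rule with optional clamping, and every function name, parameter name, and default value is hypothetical.

```python
import math

# Hypothetical helpers illustrating common learning rate schedules.
# The repository's exact formulas and defaults may differ.


def scale_lr(base_lr, batch_size, nominal_batch_size=16, lr_min=None, lr_max=None):
    """Linear scaling rule: the learning rate grows in proportion to batch size."""
    lr = base_lr * batch_size / nominal_batch_size
    if lr_min is not None:
        lr = max(lr, lr_min)
    if lr_max is not None:
        lr = min(lr, lr_max)
    return lr


def cosine_lr(step, total_steps, lr_max, lr_min):
    """Cosine decay from lr_max at step 0 to lr_min at the final step."""
    progress = step / max(1, total_steps - 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def step_lr(epoch, lr_max, decay_rate, step_size):
    """Step decay: multiply the rate by decay_rate every step_size epochs."""
    return lr_max * decay_rate ** (epoch // step_size)


start_lr = scale_lr(1e-4, batch_size=32)  # twice the nominal batch, twice the rate
for epoch in (0, 50, 99):
    print(epoch, cosine_lr(epoch, 100, start_lr, start_lr * 0.01))
```

Clamping the scaled rate guards against unstable training at very large batch sizes and against stalled training at very small ones.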

Maintenance & Community

The repository appears to be actively maintained by its author, bubbliiiing, who also maintains related PyTorch implementations of U-Net, PSPNet, and DeepLabv3+. No community channels such as Discord or Slack are mentioned.

Licensing & Compatibility

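The repository does not explicitly state a license. It references the official NVlabs/SegFormer repository, which NVIDIA distributes under its own source code license limited to non-commercial use, not under a permissive license such as Apache 2.0. Users should verify the licensing terms before any commercial use.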

Limitations & Caveats

The project requires a specific, older version of PyTorch (1.2.0), which may pose compatibility issues with newer libraries or hardware. The primary download source for weights and datasets is Baidu NetDisk, which may be inaccessible or inconvenient for some users.
