segformer-pytorch  by bubbliiiing

PyTorch code for SegFormer semantic segmentation

created 3 years ago
403 stars

Top 73.0% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of the SegFormer semantic segmentation model, enabling users to train custom models. It targets researchers and practitioners in computer vision who need a flexible and performant semantic segmentation solution. The project offers pre-trained weights and clear instructions for training and inference.

How It Works

The implementation follows the SegFormer architecture, which pairs a hierarchical Transformer encoder with a lightweight MLP decoder. The encoder uses no positional encodings, so it does not need to interpolate position codes when the test resolution differs from the training resolution, and it produces multi-scale feature maps that the decoder fuses into a single prediction. The project supports the b0 through b5 backbones and several training configurations.
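To make the encoder/decoder split concrete, the following is a minimal shape walkthrough in plain Python. It does not use this repository's code: the stage strides, the b0 channel widths, and the decoder width are assumptions taken from the SegFormer paper's MiT-B0 configuration, and the function names are hypothetical.

```python
# Shape walkthrough for a SegFormer-style model (no deep-learning framework needed).
# ASSUMPTION: strides and channel widths follow the MiT-B0 configuration described
# in the SegFormer paper; they are not read from this repository's code.

STAGE_STRIDES = (4, 8, 16, 32)     # each encoder stage downsamples the input further
B0_CHANNELS = (32, 64, 160, 256)   # assumed MiT-B0 channel widths per stage
DECODER_DIM = 256                  # assumed decoder embedding width for b0


def encoder_shapes(height, width, channels=B0_CHANNELS, strides=STAGE_STRIDES):
    """Return the (C, H, W) shape of each of the four encoder stage outputs."""
    return [(c, height // s, width // s) for c, s in zip(channels, strides)]


def decoder_shapes(stage_shapes, num_classes, embed_dim=DECODER_DIM):
    """Trace the MLP decoder: project each stage to embed_dim, upsample to the
    stage-1 resolution, concatenate, fuse back to embed_dim, then classify."""
    _, h, w = stage_shapes[0]                       # stage 1 is at 1/4 resolution
    projected = [(embed_dim, h, w) for _ in stage_shapes]
    concatenated = (embed_dim * len(projected), h, w)
    fused = (embed_dim, h, w)
    logits = (num_classes, h, w)
    return {
        "projected": projected,
        "concatenated": concatenated,
        "fused": fused,
        "logits": logits,
    }


if __name__ == "__main__":
    stages = encoder_shapes(512, 512)
    for name, shape in decoder_shapes(stages, num_classes=21).items():
        print(name, shape)
```

The logits come out at one quarter of the input resolution; a real model would typically upsample them to the input size before taking the per-pixel argmax.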

Quick Start & Requirements

  • Requires PyTorch 1.2.0; install it with pip install torch==1.2.0.
  • Pre-trained weights and datasets are available via Baidu NetDisk links provided in the README.
  • Training and inference scripts (train.py, predict.py, get_miou.py) are included.
  • Official SegFormer paper: https://arxiv.org/abs/2105.15203
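The get_miou.py script listed above presumably reports mean intersection-over-union (mIoU). As a rough illustration of that metric, and not a copy of the script, here is a minimal pure-Python sketch. The function names are hypothetical, and the ignore label of 255 is an assumption based on the usual VOC convention for unlabeled boundary pixels.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_index=255):
    """Count pixels per (true class, predicted class) pair.

    y_true and y_pred are flat sequences of integer class ids.
    Pixels whose true label equals ignore_index are skipped (assumed VOC convention).
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that appear."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                   # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp    # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                              # skip classes absent from both
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 255]
    pred = [0, 1, 1, 1, 0]
    print(mean_iou(confusion_matrix(truth, pred, num_classes=2)))
```

In the small example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12.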

Highlighted Details

  • Supports training with evaluation, multiple backbones, step and cosine learning rate decay, Adam/SGD optimizers, and adaptive learning rates based on batch size.
  • Reports mIoU from 73.34 (b0) to 80.38 (b2) on the VOC12+SBD dataset with 512x512 input.
  • Includes functionality for FPS testing, batch inference, and video detection.
  • Supports custom dataset training with VOC format.
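The learning-rate options in the list above (step decay, cosine decay, and a rate adapted to batch size) can be sketched as follows. This is an illustrative version under stated assumptions, not the repository's implementation: the function names, the nominal batch size of 16, the clamping bounds, and the step-decay parameters are all hypothetical.

```python
import math


def scale_lr(base_lr, batch_size, nominal_batch_size=16, lr_min=1e-6, lr_max=1e-2):
    """Scale the learning rate linearly with batch size, then clamp it.

    ASSUMPTION: nominal_batch_size and the clamp bounds are illustrative values.
    """
    scaled = base_lr * batch_size / nominal_batch_size
    return min(max(scaled, lr_min), lr_max)


def cosine_lr(epoch, total_epochs, lr_start, lr_end):
    """Cosine decay from lr_start at epoch 0 to lr_end at the last epoch."""
    progress = epoch / max(1, total_epochs - 1)
    return lr_end + 0.5 * (lr_start - lr_end) * (1.0 + math.cos(math.pi * progress))


def step_lr(epoch, lr_start, gamma=0.5, step_size=10):
    """Step decay: multiply the rate by gamma every step_size epochs."""
    return lr_start * gamma ** (epoch // step_size)


if __name__ == "__main__":
    lr = scale_lr(1e-4, batch_size=32)
    for epoch in (0, 25, 50, 75, 99):
        print(epoch, cosine_lr(epoch, 100, lr, lr * 0.01), step_lr(epoch, lr))
```

Cosine decay lowers the rate smoothly, while step decay keeps it constant between drops; which one suits a given dataset is an empirical question.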

Maintenance & Community

The repository is maintained by its author, bubbliiiing, who also maintains related PyTorch implementations of Unet, PSPnet, and DeepLabv3+. No community channels (Discord or Slack) are mentioned.

Licensing & Compatibility

The repository does not explicitly state a license. It references the official NVlabs/SegFormer repository, which is distributed under the NVIDIA Source Code License and limits use to non-commercial research and evaluation. Users should verify licensing before any commercial use.

Limitations & Caveats

The project requires a specific, older version of PyTorch (1.2.0), which may pose compatibility issues with newer libraries or hardware. The primary download source for weights and datasets is Baidu NetDisk, which may be inaccessible or inconvenient for some users.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 41 stars in the last 90 days
