segformer-pytorch  by bubbliiiing

PyTorch code for SegFormer semantic segmentation

Created 3 years ago
422 stars

Top 69.8% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of the SegFormer semantic segmentation model, enabling users to train custom models. It targets researchers and practitioners in computer vision who need a flexible and performant semantic segmentation solution. The project offers pre-trained weights and clear instructions for training and inference.

How It Works

The implementation follows the SegFormer architecture: a hierarchical Transformer encoder that produces multi-scale features, paired with a lightweight all-MLP decoder that fuses them. The encoder uses no positional encodings, which makes the model less sensitive to a mismatch between training and test resolution, and the simple decoder keeps parameter count and compute low. The project supports the b0 through b5 backbones and several training configurations.
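To make the encoder/decoder data flow concrete, here is a minimal shape-only sketch in plain Python. The stage strides (4, 8, 16, 32), the b0 channel widths (32, 64, 160, 256), and the decoder width of 256 are assumptions taken from the SegFormer paper's MiT-B0 configuration, not values read from this repository, and the function names are hypothetical.

```python
# Shape-only walk-through of a SegFormer-style forward pass.
# All constants are ASSUMED from the paper's MiT-B0 configuration.
STRIDES = (4, 8, 16, 32)          # downsampling factor of each encoder stage
B0_CHANNELS = (32, 64, 160, 256)  # channel width of each encoder stage
DECODER_DIM = 256                 # shared width used inside the MLP decoder


def encoder_feature_shapes(height, width, channels=B0_CHANNELS, strides=STRIDES):
    """Return the (C, H, W) shape produced by each of the four encoder stages."""
    return [(c, height // s, width // s) for c, s in zip(channels, strides)]


def decoder_shapes(height, width, num_classes, decoder_dim=DECODER_DIM):
    """Trace tensor shapes through the all-MLP decoder."""
    feats = encoder_feature_shapes(height, width)
    target_h, target_w = feats[0][1], feats[0][2]  # 1/4 of the input resolution

    # 1. A per-stage linear layer maps every feature map to decoder_dim channels.
    # 2. Each result is upsampled to the 1/4-resolution grid.
    unified = [(decoder_dim, target_h, target_w) for _ in feats]
    # 3. The four maps are concatenated along the channel axis, then fused.
    concatenated = (decoder_dim * len(unified), target_h, target_w)
    fused = (decoder_dim, target_h, target_w)
    # 4. A final linear layer predicts per-class logits at 1/4 resolution;
    #    these are upsampled to the input size for the final mask.
    logits = (num_classes, target_h, target_w)
    return {
        "encoder": feats,
        "concatenated": concatenated,
        "fused": fused,
        "logits": logits,
        "output": (num_classes, height, width),
    }


if __name__ == "__main__":
    for name, shape in decoder_shapes(512, 512, num_classes=21).items():
        print(name, shape)
```

Because the decoder only needs each stage's channel count and spatial size, swapping in a larger backbone changes the constants but not the flow.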

Quick Start & Requirements

  • Requires PyTorch 1.2.0; install it with pip install torch==1.2.0.
  • Pre-trained weights and datasets are available via Baidu NetDisk links provided in the README.
  • Training and inference scripts (train.py, predict.py, get_miou.py) are included.
  • Official SegFormer paper: https://arxiv.org/abs/2105.15203
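Since the summary pins a specific PyTorch release, it can help to confirm what is installed before running any script. The following is a small sketch using only the Python standard library (Python 3.8+); the helper names parse_version and check_torch are hypothetical and not part of the repository.

```python
from importlib import metadata

REQUIRED = "1.2.0"  # the version named in the requirements above


def parse_version(text):
    """Turn '1.2.0' or '1.2.0+cu92' into a comparable tuple of ints."""
    core = text.split("+", 1)[0]  # drop any local build suffix
    return tuple(int(part) for part in core.split(".") if part.isdigit())


def check_torch(required=REQUIRED):
    """Report whether the installed torch distribution matches the pin."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return "torch is not installed"
    if parse_version(installed) == parse_version(required):
        return f"torch {installed} matches the pinned version"
    return f"torch {installed} differs from pinned {required}; API differences are possible"


if __name__ == "__main__":
    print(check_torch())
```

The check reads package metadata rather than importing torch, so it also works in an environment where the import itself would fail.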

Highlighted Details

  • Supports training with evaluation, multiple backbones, step and cosine learning rate decay, Adam/SGD optimizers, and adaptive learning rates based on batch size.
  • Reports mIoU from 73.34 (b0) to 80.38 (b2) on the VOC12+SBD dataset with 512x512 input.
  • Includes functionality for FPS testing, batch inference, and video detection.
  • Supports custom dataset training with VOC format.
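For readers unfamiliar with the mIoU figures quoted in the list above, the metric can be sketched as follows: build a confusion matrix over pixel labels, compute intersection over union per class, and average. This is a generic illustration in plain Python, not the code in get_miou.py; the function names are hypothetical, and ignoring label 255 is an assumption based on the usual VOC convention for void pixels.

```python
def confusion_matrix(true_labels, pred_labels, num_classes, ignore_index=255):
    """Count pixels by (true class, predicted class); rows are ground truth."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        if t == ignore_index:  # ASSUMED: 255 marks void pixels, as in VOC masks
            continue
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # true class c, predicted other
        fp = sum(row[c] for row in matrix) - tp   # predicted c, true class other
        denom = tp + fp + fn
        if denom == 0:  # class absent from both labels and predictions
            continue
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Flattened toy masks: six pixels, three classes, one void pixel.
    truth = [0, 0, 1, 1, 2, 255]
    pred = [0, 1, 1, 1, 2, 2]
    cm = confusion_matrix(truth, pred, num_classes=3)
    print(round(mean_iou(cm), 4))  # per-class IoU is 1/2, 2/3, 1
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU is taken, so large and small images contribute in proportion to their pixel counts.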

Maintenance & Community

The repository is maintained by its author, bubbliiiing, who also maintains related PyTorch implementations of Unet, PSPnet, and DeepLabv3+. The last commit was 2 years ago, so the project does not appear to be under active development. No community channels (Discord/Slack) are mentioned.

Licensing & Compatibility

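The repository does not explicitly state a license. It references the official NVlabs/SegFormer repository, which is distributed under an NVIDIA source code license that limits use to non-commercial research and evaluation rather than a permissive license such as Apache 2.0. Users should verify licensing before any commercial use.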

Limitations & Caveats

The project requires a specific, older version of PyTorch (1.2.0), which may pose compatibility issues with newer libraries or hardware. The primary download source for weights and datasets is Baidu NetDisk, which may be inaccessible or inconvenient for some users.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 13 stars in the last 30 days
