Pytorch-UNet by milesial

PyTorch implementation for image semantic segmentation

Created 8 years ago

11,060 stars

Top 4.6% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of the U-Net architecture for semantic segmentation of high-definition images. It targets researchers and practitioners working on tasks such as the Carvana Image Masking competition, medical imaging, or portrait segmentation, and ships a pretrained model along with clear usage instructions.

How It Works

The implementation is a customized PyTorch version of the U-Net, a convolutional neural network architecture known for its effectiveness in biomedical image segmentation. It uses a U-shaped encoder-decoder structure whose skip connections carry encoder feature maps to the decoder at matching resolution, preserving spatial detail for precise localization. The project supports automatic mixed precision (AMP) for faster training and reduced memory usage on compatible GPUs.
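
To make the U-shaped structure concrete, the sketch below traces feature-map shapes through an encoder and decoder in plain Python. It is an illustration under stated assumptions, not the repository's code: the function name `trace_unet_shapes`, the default of 64 base channels, the depth of 4, and the rule that each stage doubles or halves the channel count are all assumed here, following the commonly described U-Net layout.

```python
def trace_unet_shapes(height, width, in_channels=3, base_channels=64, depth=4):
    """Return (stage, channels, height, width) tuples for a U-Net-style network.

    Illustrative assumptions: convolutions preserve spatial size, each encoder
    stage halves the resolution and doubles the channels, and each decoder
    stage upsamples by 2 and halves the channels before concatenation.
    """
    if height % 2 ** depth or width % 2 ** depth:
        raise ValueError("height and width must be divisible by 2**depth")

    stages = [("input", in_channels, height, width)]
    channels, h, w = base_channels, height, width
    stages.append(("enc0", channels, h, w))

    # Encoder: remember each stage's output so the decoder can reuse it.
    skips = []
    for level in range(1, depth + 1):
        skips.append((channels, h, w))
        channels, h, w = channels * 2, h // 2, w // 2
        stages.append((f"enc{level}", channels, h, w))

    # Decoder: upsample, concatenate the matching skip, then convolve down.
    for level in range(depth - 1, -1, -1):
        skip_channels, skip_h, skip_w = skips.pop()
        upsampled_channels = channels // 2
        h, w = h * 2, w * 2
        assert (h, w) == (skip_h, skip_w)  # skip and upsampled maps must align
        stages.append((f"dec{level} (concat)", upsampled_channels + skip_channels, h, w))
        channels = skip_channels
        stages.append((f"dec{level}", channels, h, w))

    return stages


if __name__ == "__main__":
    for stage in trace_unet_shapes(64, 64):
        print(stage)
```

The concatenation rows show why skip connections matter: the decoder sees both the upsampled coarse features and the full-resolution encoder features at every level.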

Quick Start & Requirements

Installation: pip install -r requirements.txt
Prerequisites: PyTorch 1.13 or later, CUDA.
Data: Download data using scripts/download_data.sh. Images and masks should be placed in data/imgs and data/masks respectively.
Docker: A Docker image is available on DockerHub for simplified setup.
Pretrained Model: Can be loaded via torch.hub.load('milesial/Pytorch-UNet', 'unet_carvana', pretrained=True, scale=0.5).
Documentation: Usage details for training and prediction are provided in the README.
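
Before training on a custom dataset, it can help to confirm that every image in data/imgs has exactly one mask in data/masks. The following is a minimal sketch using only the standard library. The helper name `pair_images_and_masks` and the `_mask` filename suffix are assumptions made for illustration; adjust the suffix to match your own naming scheme.

```python
from pathlib import Path


def pair_images_and_masks(img_dir, mask_dir, mask_suffix="_mask"):
    """Pair each image with its mask and report anything unexpected.

    Hypothetical helper: assumes a mask is named <image stem><mask_suffix>.<ext>.
    Returns (pairs, problems), where pairs is a list of (image, mask) paths.
    """
    img_dir, mask_dir = Path(img_dir), Path(mask_dir)
    pairs, problems = [], []
    for entry in sorted(img_dir.iterdir()):
        # Subfolders and hidden files (e.g. .DS_Store) are not images.
        if entry.is_dir() or entry.name.startswith("."):
            problems.append(f"unexpected entry: {entry.name}")
            continue
        matches = sorted(mask_dir.glob(f"{entry.stem}{mask_suffix}.*"))
        if len(matches) != 1:
            problems.append(f"{entry.name}: expected 1 mask, found {len(matches)}")
            continue
        pairs.append((entry, matches[0]))
    return pairs, problems


if __name__ == "__main__":
    pairs, problems = pair_images_and_masks("data/imgs", "data/masks")
    print(f"{len(pairs)} image/mask pairs")
    for problem in problems:
        print("warning:", problem)
```

Running the check once up front surfaces stray files and missing masks before they turn into errors partway through a training run.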

Highlighted Details

Achieved a Dice coefficient of 0.988423 on over 100k test images for the Carvana dataset.
Supports automatic mixed precision (--amp) for performance gains.
Real-time training visualization via Weights & Biases integration.
Pretrained model available for the Carvana dataset via torch.hub.
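
The Dice coefficient quoted above measures the overlap between a predicted mask and the ground-truth mask, with 1.0 meaning a perfect match. A minimal sketch in plain Python follows, assuming flat binary masks. The function name and the smoothing term `eps` are illustrative choices and not the repository's own implementation.

```python
def dice_coefficient(pred, target, eps=1e-6):
    """Dice = 2 * |intersection| / (|pred| + |target|) for flat binary masks.

    eps keeps the ratio defined when both masks are empty (illustrative choice).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2 * intersection + eps) / (total + eps)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    # One overlapping pixel, three foreground pixels in total: about 2/3.
    print(dice_coefficient(predicted, ground_truth))
```

Because the score is a ratio of overlap to total foreground, it is less dominated by large background regions than plain pixel accuracy would be.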

Maintenance & Community

The project appears to be a personal implementation, with no explicit mention of maintainers, community channels (like Discord/Slack), or a public roadmap.

Licensing & Compatibility

The README does not explicitly state a license. Whether the code can be used commercially depends on the license this repository adopts, so check the repository for a license file before reuse.

Limitations & Caveats

The README does not specify a license, which may pose a barrier to commercial use. The data loader is described as "greedy": it appears to pick up every entry in the image and mask folders, so stray files, subfolders, or inconsistent file naming in custom datasets can cause problems.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History

109 stars in the last 30 days