Pytorch-UNet by milesial

PyTorch implementation for image semantic segmentation

Created 8 years ago
10,553 stars

Top 4.8% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of the U-Net architecture, specifically tailored for high-definition image semantic segmentation tasks. It is suitable for researchers and practitioners working on challenges like the Carvana Image Masking competition, medical imaging, or portrait segmentation, offering a high-quality, pre-trained model and clear usage instructions.

How It Works

The implementation is a customized PyTorch version of the U-Net, a convolutional neural network architecture known for its effectiveness in biomedical image segmentation. It utilizes a U-shaped encoder-decoder structure with skip connections to preserve spatial information, enabling precise localization. The project supports automatic mixed precision (AMP) for faster training and reduced memory usage on compatible GPUs.
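As a rough illustration of the encoder-decoder structure with skip connections, and of one mixed-precision training step, here is a minimal sketch. It is not the repository's model: `TinyUNet`, `double_conv`, the layer widths, and the random data are hypothetical names and values chosen for brevity, and the input height and width are assumed to be divisible by 4.

```python
import torch
import torch.nn as nn


def double_conv(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical two-level U-Net, for illustration only."""

    def __init__(self, n_channels: int = 3, n_classes: int = 2, width: int = 8):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        # Encoder: channels grow as resolution shrinks.
        self.enc1 = double_conv(n_channels, width)
        self.enc2 = double_conv(width, width * 2)
        self.bottleneck = double_conv(width * 2, width * 4)
        # Decoder: upsample, then fuse with the matching encoder output.
        self.up2 = nn.ConvTranspose2d(width * 4, width * 2, kernel_size=2, stride=2)
        self.dec2 = double_conv(width * 4, width * 2)
        self.up1 = nn.ConvTranspose2d(width * 2, width, kernel_size=2, stride=2)
        self.dec1 = double_conv(width * 2, width)
        self.head = nn.Conv2d(width, n_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s1 = self.enc1(x)                    # full resolution
        s2 = self.enc2(self.pool(s1))        # 1/2 resolution
        b = self.bottleneck(self.pool(s2))   # 1/4 resolution
        # Skip connections: concatenate encoder features along the channel axis.
        d2 = self.dec2(torch.cat([s2, self.up2(b)], dim=1))
        d1 = self.dec1(torch.cat([s1, self.up1(d2)], dim=1))
        return self.head(d1)                 # per-pixel class logits


# One training step with automatic mixed precision (enabled only on CUDA here).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"
net = TinyUNet().to(device)
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

images = torch.randn(2, 3, 32, 32, device=device)          # random stand-in batch
targets = torch.randint(0, 2, (2, 32, 32), device=device)  # random stand-in masks

with torch.autocast(device_type=device.type, enabled=use_amp):
    logits = net(images)
    loss = nn.functional.cross_entropy(logits, targets)

optimizer.zero_grad()
scaler.scale(loss).backward()  # loss scaling is a no-op when AMP is disabled
scaler.step(optimizer)
scaler.update()
```

The output keeps the input's spatial size because each upsampling step mirrors a pooling step, which is what lets the concatenations line up.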

Quick Start & Requirements

  • Installation: pip install -r requirements.txt
  • Prerequisites: PyTorch 1.13 or later, CUDA.
  • Data: Download data using scripts/download_data.sh. Images and masks should be placed in data/imgs and data/masks respectively.
  • Docker: A Docker image is available on DockerHub for simplified setup.
  • Pretrained Model: Can be loaded via torch.hub.load('milesial/Pytorch-UNet', 'unet_carvana', pretrained=True, scale=0.5).
  • Documentation: Usage details for training and prediction are provided in the README.
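A minimal inference sketch built around the `torch.hub.load` call listed above. The helper names `load_carvana_unet` and `predict_mask` are hypothetical, and the code assumes the model returns per-class logits of shape (N, C, H, W) and that inputs are RGB float tensors in [0, 1]; the repository's own prediction script is the reference for actual preprocessing.

```python
import torch
import torch.nn.functional as F


def load_carvana_unet(scale: float = 0.5) -> torch.nn.Module:
    """Fetch the pretrained model via torch.hub (requires network access)."""
    net = torch.hub.load(
        "milesial/Pytorch-UNet", "unet_carvana", pretrained=True, scale=scale
    )
    return net.eval()


@torch.no_grad()
def predict_mask(net: torch.nn.Module, image: torch.Tensor, scale: float = 0.5) -> torch.Tensor:
    """Return an (H, W) tensor of class indices for a (3, H, W) image.

    Assumption: the image is downscaled by `scale` before the forward pass
    and the logits are resized back to the original resolution.
    """
    h, w = image.shape[-2:]
    x = F.interpolate(image.unsqueeze(0), scale_factor=scale,
                      mode="bilinear", align_corners=False)
    logits = net(x)
    logits = F.interpolate(logits, size=(h, w), mode="bilinear", align_corners=False)
    return logits.argmax(dim=1).squeeze(0)


# Offline demonstration with a stand-in model; swap in load_carvana_unet()
# when network access is available.
stand_in = torch.nn.Conv2d(3, 2, kernel_size=1).eval()
mask = predict_mask(stand_in, torch.rand(3, 64, 96))
```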

Highlighted Details

  • Achieved a Dice coefficient of 0.988423 on over 100k test images for the Carvana dataset.
  • Supports automatic mixed precision (--amp) for performance gains.
  • Real-time training visualization via Weights & Biases integration.
  • Pretrained model available for the Carvana dataset via torch.hub.
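For readers unfamiliar with the metric quoted above, the Dice coefficient of a predicted mask P and a ground-truth mask T is 2|P ∩ T| / (|P| + |T|). The sketch below computes it for flat binary masks in plain Python; `dice_coefficient` is a hypothetical helper, and returning 1.0 when both masks are empty is a convention assumed here, not necessarily the one the repository uses.

```python
def dice_coefficient(pred, target):
    """Dice score for two equal-length sequences of 0/1 pixel labels."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# One overlapping pixel, 2 predicted and 1 true foreground pixel: 2*1 / (2+1)
score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1.0 means the masks match exactly, and 0.0 means they share no foreground pixels.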

Maintenance & Community

The project appears to be a personal implementation, with no explicit mention of maintainers, community channels (like Discord/Slack), or a public roadmap.

Licensing & Compatibility

The README does not explicitly state a license. Whether the code can be used commercially depends on the license the repository itself carries, so check for a LICENSE file or other license notice in the repository before relying on it.

Limitations & Caveats

The README does not specify a license, which may pose a barrier to commercial use. The data loader is described as "greedy," which suggests it picks up every file it finds in the image and mask folders; custom datasets with sub-folders, stray files, or inconsistent file names may therefore cause problems.
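One way to guard against loader surprises is to check a custom dataset before training. The sketch below is a hypothetical pre-flight check, not part of the repository: `check_dataset` and its `mask_suffix` parameter are illustrative names, and it assumes each image is paired with a mask whose file stem is the image stem plus the suffix.

```python
import tempfile
from pathlib import Path


def check_dataset(img_dir, mask_dir, mask_suffix=""):
    """Return a list of human-readable problems found in the two folders."""
    img_dir, mask_dir = Path(img_dir), Path(mask_dir)
    problems = []
    # Flag anything a loader that reads every entry might trip over.
    for folder in (img_dir, mask_dir):
        for entry in sorted(folder.iterdir()):
            if entry.is_dir():
                problems.append(f"sub-folder not expected: {entry}")
            elif entry.name.startswith("."):
                problems.append(f"hidden file not expected: {entry}")
    # Every image should have a mask with the matching stem.
    mask_stems = {p.stem for p in mask_dir.iterdir() if p.is_file()}
    for img in sorted(p for p in img_dir.iterdir() if p.is_file()):
        if not img.name.startswith(".") and img.stem + mask_suffix not in mask_stems:
            problems.append(f"no mask for image: {img.stem}")
    return problems


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    imgs, masks = Path(root, "imgs"), Path(root, "masks")
    imgs.mkdir()
    masks.mkdir()
    (imgs / "a.jpg").touch()
    (imgs / "b.jpg").touch()        # has no mask
    (imgs / "extra").mkdir()        # unexpected sub-folder
    (masks / "a_mask.gif").touch()
    problems = check_dataset(imgs, masks, mask_suffix="_mask")
```

An empty list means the layout passed these particular checks; it does not guarantee the images and masks themselves are valid.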

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 126 stars in the last 30 days
