InSPyReNet  by plemeri

PyTorch implementation for high-resolution salient object detection

created 3 years ago
656 stars

Top 51.9% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation of InSPyReNet, a framework for high-resolution salient object detection (HR-SOD). It performs HR-SOD without requiring high-resolution (HR) training datasets by combining an image pyramid structure with a pyramid blending method that compensates for receptive field discrepancies between image scales. The target audience is computer vision researchers and practitioners working on image segmentation and object detection.

How It Works

InSPyReNet uses an image pyramid structure to generate saliency maps at multiple resolutions. Its key component is a pyramid blending method, which combines the predictions made on a low-resolution (LR) and a high-resolution (HR) version of the same image. This blending is designed to mitigate the effective receptive field (ERF) discrepancy between the two resolutions, so the model can produce accurate HR predictions without being trained on HR data.
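The blending idea can be illustrated with a small NumPy sketch. This is a simplified illustration, not the repository's implementation: it assumes a plain Laplacian pyramid built with 2x2 average pooling and nearest-neighbor upsampling, and the function names (`downsample`, `upsample`, `laplacian_pyramid`, `reconstruct`, `blend`) are hypothetical. The coarse base is taken from the LR prediction and the fine details from the HR prediction.

```python
import numpy as np


def downsample(x):
    """Halve resolution with 2x2 average pooling (height and width must be even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double resolution with nearest-neighbor repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(x, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse_base] for a 2-D map."""
    pyramid = []
    current = x
    for _ in range(levels):
        smaller = downsample(current)
        # Detail layer: what is lost when going one level coarser.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert laplacian_pyramid by adding details back from coarse to fine."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = upsample(current) + detail
    return current


def blend(lr_map, hr_map, levels):
    """Assumed blending rule: coarse base from lr_map, details from hr_map."""
    hr_pyramid = laplacian_pyramid(hr_map, levels)
    if hr_pyramid[-1].shape != lr_map.shape:
        raise ValueError("lr_map must match the coarsest level of hr_map")
    blended = hr_pyramid[:-1] + [lr_map]
    return np.clip(reconstruct(blended), 0.0, 1.0)
```

In this sketch the LR map supplies the globally consistent structure, while the HR map contributes only boundary detail, which is the intuition behind blending predictions whose receptive fields cover different fractions of the image.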

Quick Start & Requirements

  • Install: pip install transparent-background
  • Prerequisites: PyTorch. The provided models use backbones such as Res2Net and Swin Transformer.
  • Data Download: python utils/download.py --extra --dest [DEST]
  • Resources: Official documentation for training, testing, and inference is available at getting_started.md. Model Zoo and pre-computed results are detailed in model_zoo.md. A web demo is available via HuggingFace.

Highlighted Details

  • Reports state-of-the-art results on standard SOD metrics and on boundary accuracy for HR images.
  • Offers a command-line tool and Python API via the transparent-background package.
  • Extended for lane segmentation in driving scenes (LaneSOD repository).
  • Supports multiple backbones including Res2Net and Swin Transformer.
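A minimal sketch of the Python API mentioned above, assuming the `transparent-background` package exposes a `Remover` class with a `process` method (as its README describes at the time of writing; check the installed version). The helper name `remove_background` and the file paths are hypothetical, and the imports are deferred so the function can be defined even where the packages are not installed.

```python
def remove_background(image_path, output_path):
    """Save a copy of image_path with its background made transparent."""
    # Deferred imports: both packages are only needed when the function runs.
    from PIL import Image
    from transparent_background import Remover

    remover = Remover()  # assumed to fetch the default checkpoint on first use
    image = Image.open(image_path).convert("RGB")
    result = remover.process(image, type="rgba")

    # Depending on the package version, the result may be an array
    # rather than a PIL image, so normalize before saving.
    if not isinstance(result, Image.Image):
        result = Image.fromarray(result)
    result.save(output_path)
    return output_path
```

A call such as `remove_background("input.jpg", "output.png")` would write an RGBA PNG; the output should use a format that supports an alpha channel.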

Maintenance & Community

The project was presented at ACCV 2022. A web demo is available on HuggingFace, provided by TasksWithCode.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

With no license stated in the README, the terms for commercial adoption and closed-source integration are unclear.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 57 stars in the last 90 days
