BiRefNet by ZhengPeng7

High-resolution image segmentation and matting model

Created 3 years ago

3,180 stars

Top 14.8% on SourcePulse

1 Expert Loves This Project

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

BiRefNet is a model for high-resolution dichotomous image segmentation (DIS) and general matting, built around a "Bilateral Reference" mechanism. It is aimed at researchers and engineers who want state-of-the-art accuracy and flexibility across segmentation applications, and it ships with pretrained models and deployment tools.

How It Works

The core "Bilateral Reference" mechanism leverages contextual information for high-resolution segmentation, achieving state-of-the-art results on dichotomous image segmentation (DIS), camouflaged object detection (COD), and high-resolution salient object detection (HRSOD) benchmarks. Beyond these academic benchmarks, the project provides general-purpose models and specialized models for matting and higher input resolutions, which makes it more applicable to real-world use.

Quick Start & Requirements

Installation requires Python 3.10 and PyTorch with CUDA. Recommended PyTorch versions are 2.5.1+CUDA12.4 or 2.0.1+CUDA11.8. Training demands significant GPU resources (e.g., 22.5GB+ VRAM for FP16 training), while inference is more accessible (~3.45GB VRAM for 1024x1024 FP16). Models are loadable via Hugging Face Transformers (AutoModelForImageSegmentation.from_pretrained), with Colab demos available for inference and ONNX conversion.
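As a rough illustration of the Hugging Face loading path described above, the sketch below loads a checkpoint and predicts a foreground mask for one image. The repository id ZhengPeng7/BiRefNet, the trust_remote_code=True flag, the ImageNet normalization constants, and the convention that the last model output holds the mask logits are assumptions not stated on this page; check the model card before relying on them.

```python
from typing import Tuple

MODEL_ID = "ZhengPeng7/BiRefNet"           # assumed Hugging Face repo id
INPUT_SIZE: Tuple[int, int] = (1024, 1024)  # resolution quoted for FP16 inference
IMAGENET_MEAN = (0.485, 0.456, 0.406)       # assumed preprocessing statistics
IMAGENET_STD = (0.229, 0.224, 0.225)


def load_model(device: str = "cuda", half: bool = True):
    """Load the segmentation model; heavy imports stay inside the function."""
    from transformers import AutoModelForImageSegmentation

    model = AutoModelForImageSegmentation.from_pretrained(
        MODEL_ID, trust_remote_code=True  # assumption: the repo ships custom model code
    )
    model = model.to(device).eval()
    return model.half() if half else model


def predict_mask(model, image):
    """Return a grayscale PIL mask with the same size as the input PIL image."""
    import torch
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize(INPUT_SIZE),
        transforms.ToTensor(),
        transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
    ])
    # Match the device and dtype of the model weights (FP16 or FP32).
    param = next(model.parameters())
    batch = preprocess(image.convert("RGB")).unsqueeze(0)
    batch = batch.to(device=param.device, dtype=param.dtype)

    with torch.no_grad():
        # Assumption: the final element of the output holds the mask logits.
        pred = model(batch)[-1].sigmoid().float().cpu()

    mask = transforms.ToPILImage()(pred[0].squeeze())
    return mask.resize(image.size)
```

A typical call would be mask = predict_mask(load_model(), Image.open(path)), with Image imported from PIL and path pointing to a local file.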

Highlighted Details

SOTA Performance: Achieves leading results on DIS, COD, and HRSOD benchmarks.
Model Zoo: Offers pre-trained weights for general use, matting, DIS, COD, and HRSOD, supporting various backbones (Swin_v1, PVT_v2) and resolutions up to 2560x1440.
Flexible Deployment: Supports FP16 inference, ONNX, and TensorRT optimization for faster deployment. Models are easily loadable via Hugging Face.
Fine-tuning: Enables customization on user datasets with a detailed tutorial available on YouTube.
Advanced Features: Includes support for box-guided segmentation and community integrations with platforms like ComfyUI and Stable Diffusion WebUI.
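For the matting use case listed above, a predicted mask is usually applied as an alpha matte to place the foreground on a new background. The sketch below shows the standard per-pixel compositing rule, out = alpha * foreground + (1 - alpha) * background, in plain Python. It is independent of BiRefNet's own code, and the function name composite and the flat pixel-list layout are illustrative choices, not part of the project.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # one RGB pixel with 0-255 channels


def composite(
    foreground: Sequence[Pixel],
    alpha: Sequence[float],
    background: Sequence[Pixel],
) -> List[Pixel]:
    """Blend foreground over background using a per-pixel alpha in [0, 1]."""
    if not (len(foreground) == len(alpha) == len(background)):
        raise ValueError("foreground, alpha and background must have equal length")

    blended: List[Pixel] = []
    for fg, a, bg in zip(foreground, alpha, background):
        a = min(1.0, max(0.0, a))  # clamp out-of-range alpha values
        blended.append(
            tuple(round(a * f + (1.0 - a) * b) for f, b in zip(fg, bg))
        )
    return blended
```

In practice the same rule is applied to whole arrays at once (for example with NumPy or by setting the mask as an image's alpha channel), but the arithmetic per pixel is the same.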

Maintenance & Community

The project is actively maintained by university researchers and supported by industry partners such as Freepik and Features and Labels Inc. A Discord community is available for discussion. Community contributions have been significant, including integrations and re-implementations in other frameworks.

Licensing & Compatibility

The specific open-source license for the BiRefNet repository is not explicitly stated in the provided README. This lack of clarity is a critical factor for potential adopters evaluating commercial use or derivative works.

Limitations & Caveats

Training BiRefNet requires substantial GPU memory and compute. ONNX conversion is supported, but the converted model runs slower than native PyTorch inference. The README does not define a license, which is a significant adoption blocker: compatibility with commercial or closed-source applications cannot be assessed definitively.
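Because the gap between ONNX and PyTorch inference speed depends on hardware and runtime settings, it may be worth measuring it locally. The following is a minimal timing sketch, not a tool from the repository: benchmark accepts any zero-argument callable, and the optional sync hook is an assumed way to wait for asynchronous GPU work before stopping the clock.

```python
import statistics
import time
from typing import Callable, Dict, Optional


def benchmark(
    fn: Callable[[], object],
    warmup: int = 3,
    runs: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> Dict[str, float]:
    """Time fn over several runs and return summary latencies in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for pending device work before reading the clock
        samples.append(time.perf_counter() - start)

    return {
        "mean_s": statistics.mean(samples),
        "median_s": statistics.median(samples),
        "min_s": min(samples),
    }
```

To compare backends, wrap each inference call in its own closure (for example hypothetical run_pytorch and run_onnx functions that process the same input) and pass them to benchmark in turn. For CUDA runs with PyTorch, sync=torch.cuda.synchronize keeps asynchronous kernels from skewing the timings.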

Health Check

Last Commit

2 days ago

Responsiveness

Inactive

Star History

74 stars in the last 30 days