WebUI extension for Stable Diffusion image segmentation tasks
Top 14.1% on sourcepulse
This extension integrates the Segment Anything Model (SAM) and GroundingDINO into the AUTOMATIC1111 Stable Diffusion WebUI, enabling advanced image segmentation and manipulation. It targets users seeking to enhance inpainting, semantic segmentation, automated matting, and LoRA/LyCORIS training set creation within Stable Diffusion workflows.
How It Works
The extension leverages SAM for generating masks from point or box prompts, and GroundingDINO for text-guided object detection and segmentation. It integrates with the Mikubill ControlNet extension, allowing generated masks to be used directly for ControlNet inpainting and semantic segmentation tasks. This automates complex masking processes and provides fine-grained control over image generation.
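In practice, SAM returns boolean HxW masks, and the extension lets you expand a mask before handing it to ControlNet inpainting so the inpainted region covers object edges. A minimal NumPy sketch of such an expansion step (illustrative only; the function name is hypothetical and the extension's real implementation differs):

```python
import numpy as np

def expand_mask(mask: np.ndarray, pixels: int) -> np.ndarray:
    """Grow a boolean HxW mask by `pixels` in each 4-connected direction.

    Naive binary dilation as a stand-in for the extension's
    mask-expansion option.
    """
    out = mask.copy()
    for _ in range(pixels):
        grown = out.copy()
        grown[1:, :] |= out[:-1, :]   # grow downward
        grown[:-1, :] |= out[1:, :]   # grow upward
        grown[:, 1:] |= out[:, :-1]   # grow rightward
        grown[:, :-1] |= out[:, 1:]   # grow leftward
        out = grown
    return out
```

The expanded mask can then be passed to the inpainting pipeline in place of the raw SAM output.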
Quick Start & Requirements
Install by running git clone into the ${sd-webui}/extensions directory, or through the WebUI's Extensions tab. Requires a recent AUTOMATIC1111 WebUI (commit 22bcc7be or later) and the Mikubill ControlNet Extension. Download SAM model weights into ${sd-webui}/models/sam or ${sd-webui-segment-anything}/models/sam.
Highlighted Details
Maintenance & Community
The project appears to be in a maintenance phase with ongoing issue resolution and monitoring for new research. Community contributions and feature requests are welcomed. Links to demos and usage guides are provided.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility with commercial use or closed-source linking is not specified.
Limitations & Caveats
The extension is noted as not thoroughly tested, with potential bugs. Certain features like color inpainting and explicit mask editing are limited by Gradio/WebUI capabilities. Anime image layout generation performance is poor. GroundingDINO installation can be problematic, though a local installation option is provided to mitigate C++/CUDA compilation issues.