Image editing research project using segmentation and diffusion
This project provides a flexible framework for image editing and generation, leveraging state-of-the-art models like Segment Anything (SAM), ControlNet, and Stable Diffusion. It targets researchers and power users seeking advanced control over image manipulation, enabling tasks from style transfer to detailed object editing.
How It Works
The core approach combines SAM for precise, category-agnostic segmentation with ControlNet and Stable Diffusion for conditional image generation. This allows users to define regions of interest via segmentation masks and then guide the editing or generation process with text prompts. The system supports various conditioning methods, including SAM masks, text-guided object/part masks (via GroundingDINO), and even cross-image region merging for creative fusion.
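The README does not include a code walkthrough, so below is a minimal sketch of the mask-then-edit flow described above, built on the segment-anything and diffusers packages. The checkpoint path, model ID, click coordinates, and prompt are illustrative assumptions rather than the project's actual code, which additionally routes conditioning through ControlNet and supports GroundingDINO text-guided masks.

import numpy as np
import torch
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor
from diffusers import StableDiffusionInpaintPipeline

# 1. Segment the region of interest with SAM from a single click point.
#    Checkpoint path and click coordinates are placeholders.
image = np.array(Image.open("input.png").convert("RGB"))
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # (x, y) click on the target object
    point_labels=np.array([1]),           # 1 = foreground point
    multimask_output=False,
)
mask = Image.fromarray((masks[0] * 255).astype(np.uint8))

# 2. Restrict diffusion-based editing to the masked region and guide it with text.
#    Model ID and prompt are placeholders; the project itself adds ControlNet conditioning.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
result = pipe(
    prompt="a red leather armchair",
    image=Image.fromarray(image),
    mask_image=mask,
).images[0]
result.save("edited.png")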
Quick Start & Requirements
conda env create -f environment.yaml
conda activate control
python app.py
Highlighted Details
Maintenance & Community
The project was accepted to the ACM MM demo track in July 2023 and has seen recent UI and code revisions. It welcomes contributions and suggestions.
Licensing & Compatibility
The project is released under a permissive license, compatible with commercial use and closed-source linking.
Limitations & Caveats
While the project supports various models and editing techniques, some functionalities like text-guided editing are marked as initial versions. Specific model requirements and download paths are detailed in the README.