Image editing research project using segmentation and diffusion
This project provides a flexible framework for image editing and generation, leveraging state-of-the-art models like Segment Anything (SAM), ControlNet, and Stable Diffusion. It targets researchers and power users seeking advanced control over image manipulation, enabling tasks from style transfer to detailed object editing.
How It Works
The core approach combines SAM for precise, category-agnostic segmentation with ControlNet and Stable Diffusion for conditional image generation. This allows users to define regions of interest via segmentation masks and then guide the editing or generation process with text prompts. The system supports various conditioning methods, including SAM masks, text-guided object/part masks (via GroundingDINO), and even cross-image region merging for creative fusion.
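The README does not include a code walkthrough, so below is a minimal sketch of the mask-then-edit flow described above, built on the segment-anything and diffusers packages. The checkpoint path, model ID, click coordinates, and prompt are illustrative assumptions rather than the project's actual code, which additionally routes conditioning through ControlNet and supports GroundingDINO text-guided masks.

import numpy as np
import torch
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor
from diffusers import StableDiffusionInpaintPipeline

# 1. Segment the region of interest with SAM from a single click point.
#    Checkpoint path and click coordinates are placeholders.
image = np.array(Image.open("input.png").convert("RGB"))
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # (x, y) click on the target object
    point_labels=np.array([1]),           # 1 = foreground point
    multimask_output=False,
)
mask = Image.fromarray((masks[0] * 255).astype(np.uint8))

# 2. Restrict diffusion-based editing to the masked region and guide it with text.
#    Model ID and prompt are placeholders; the project itself adds ControlNet conditioning.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
result = pipe(
    prompt="a red leather armchair",
    image=Image.fromarray(image),
    mask_image=mask,
).images[0]
result.save("edited.png")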
Quick Start & Requirements
conda env create -f environment.yaml
conda activate control
python app.py
Highlighted Details
Maintenance & Community
The project was accepted to the ACM MM demo track in July 2023 and has seen recent UI and code revisions. It welcomes contributions and suggestions.
Licensing & Compatibility
The project is released under a permissive license, compatible with commercial use and closed-source linking.
Limitations & Caveats
While the project supports various models and editing techniques, some functionalities like text-guided editing are marked as initial versions. Specific model requirements and download paths are detailed in the README.