PowerPaint by open-mmlab

Image inpainting model for versatile image editing tasks

created 1 year ago
944 stars

Top 39.7% on sourcepulse

Project Summary

PowerPaint is a versatile image inpainting model designed for researchers and practitioners in computer vision and generative AI. It offers a unified solution for text-guided object inpainting, object removal, shape-guided object insertion, and outpainting, all within a single model, simplifying complex image editing workflows.

How It Works

PowerPaint leverages task-specific prompts to guide its inpainting process, enabling it to handle diverse editing tasks with a single architecture. It builds upon the BrushNet framework, preserving cross-attention layers for prompt integration, which allows for fine-grained control over the inpainting results, particularly in shape-guided generation.
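
As a rough illustration of how task-specific prompts can let one model cover several editing tasks, the sketch below appends a task token to the user's text description before it would be handed to the model. The token strings, the task keys, and the `build_prompt` helper are hypothetical placeholders chosen for this example; the summary does not give PowerPaint's actual tokens or API.

```python
# Hypothetical task tokens: placeholders for illustration only,
# not PowerPaint's real prompt vocabulary.
TASK_TOKENS = {
    "text-guided": "<task-text-guided>",
    "object-removal": "<task-removal>",
    "shape-guided": "<task-shape-guided>",
    "outpainting": "<task-outpainting>",
}


def build_prompt(task: str, user_prompt: str = "") -> str:
    """Combine an optional text description with the token for the chosen task."""
    if task not in TASK_TOKENS:
        raise ValueError(f"unknown task: {task!r}")
    # Keep only non-empty parts so a task without a description yields just the token.
    parts = [user_prompt.strip(), TASK_TOKENS[task]]
    return " ".join(p for p in parts if p)


if __name__ == "__main__":
    print(build_prompt("text-guided", "a red vintage car"))
    print(build_prompt("object-removal"))
```

The point of the sketch is only that the task is selected through the prompt rather than by swapping models; how the real model consumes these prompts is outside what the summary states.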

Quick Start & Requirements

  • Installation: Clone the repository, create and activate a conda environment (conda create --name ppt python=3.9, then conda activate ppt), and install dependencies with pip install -r requirements/requirements.txt. Alternatively, create the environment from the provided file with conda env create -f requirements/ppt.yaml.
  • Prerequisites: CUDA 11.8, Python 3.9. Git LFS is required for downloading model weights.
  • Inference: Launch the Gradio interface with python app.py --share. For PowerPaint-V2, use python app.py --share --version ppt-v2 --checkpoint_dir checkpoints/ppt-v2. Model weights can be downloaded from Hugging Face.
  • Resources: Requires significant GPU resources for inference and training.
  • Links: Online Demo, Model Weights
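
The two launch commands in the Inference step differ only in their flags, which the following sketch assembles programmatically. It uses only the flags quoted above (--share, --version, --checkpoint_dir); the `build_launch_command` helper itself is a hypothetical convenience and is not part of the repository.

```python
import shlex
from typing import List, Optional


def build_launch_command(version: Optional[str] = None,
                         checkpoint_dir: Optional[str] = None,
                         share: bool = True) -> List[str]:
    """Assemble the argument list for launching the Gradio app (hypothetical helper)."""
    cmd = ["python", "app.py"]
    if share:
        cmd.append("--share")
    # Version and checkpoint flags are only added when explicitly requested.
    if version is not None:
        cmd += ["--version", version]
    if checkpoint_dir is not None:
        cmd += ["--checkpoint_dir", checkpoint_dir]
    return cmd


if __name__ == "__main__":
    # Default launch, as quoted in the Inference step.
    print(shlex.join(build_launch_command()))
    # PowerPaint-V2 launch with its checkpoint directory.
    print(shlex.join(build_launch_command("ppt-v2", "checkpoints/ppt-v2")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming the environment is active and the model weights are already in place.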

Highlighted Details

  • Supports text-guided object inpainting, object removal, shape-guided object insertion, and outpainting.
  • Compatible with ControlNet for generating objects with control images (e.g., Canny, Depth, HED, Pose).
  • Offers a "fitting degree" parameter for shape-guided inpainting to control object adherence to mask shapes.
  • PowerPaint-V2, built on BrushNet with RealisticVision, offers higher visual quality.
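
The summary does not describe how the "fitting degree" parameter acts internally. One plausible reading, offered here purely as an assumption, is a weight that blends two conditioning vectors: one favoring adherence to the mask shape and one favoring consistency with the surrounding context. The sketch below shows that linear blend in plain Python; the function name, argument names, and toy vectors are hypothetical.

```python
from typing import List, Sequence


def blend_conditioning(shape_vec: Sequence[float],
                       context_vec: Sequence[float],
                       fitting_degree: float) -> List[float]:
    """Linearly blend two conditioning vectors (assumed mechanism, not confirmed)."""
    if not 0.0 <= fitting_degree <= 1.0:
        raise ValueError("fitting_degree must lie in [0, 1]")
    if len(shape_vec) != len(context_vec):
        raise ValueError("vectors must have the same length")
    # fitting_degree = 1.0 returns the shape vector; 0.0 returns the context vector.
    return [fitting_degree * s + (1.0 - fitting_degree) * c
            for s, c in zip(shape_vec, context_vec)]


if __name__ == "__main__":
    shape = [1.0, 0.0]
    context = [0.0, 1.0]
    for degree in (0.0, 0.5, 1.0):
        print(degree, blend_conditioning(shape, context, degree))
```

Under this reading, a higher fitting degree pulls the generated object toward filling the mask outline, while a lower value lets the object's silhouette deviate from the mask.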

Maintenance & Community

The project is associated with OpenMMLab, and its most recent noted updates date from May 2024. Contact information for key contributors is provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Training PowerPaint-V1 requires a large batch size (e.g., 1024), while V2 is more memory-efficient. The README notes that logical errors in ControlNet loading have been fixed, which suggests earlier stability issues.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 2

Star History

66 stars in the last 90 days
