PowerPaint by open-mmlab

Image inpainting model for versatile image editing tasks

Created 2 years ago
1,079 stars

Top 34.7% on SourcePulse

Project Summary

PowerPaint is a versatile image inpainting model designed for researchers and practitioners in computer vision and generative AI. It offers a unified solution for text-guided object inpainting, object removal, shape-guided object insertion, and outpainting, all within a single model, simplifying complex image editing workflows.

How It Works

PowerPaint uses task-specific prompts to steer the inpainting process, so a single architecture can handle all of these editing tasks. PowerPaint-V2 builds on the BrushNet framework and preserves the cross-attention layers for prompt integration, which allows fine-grained control over the inpainting results, particularly in shape-guided generation.

Quick Start & Requirements

  • Installation: Clone the repository, then either create a conda environment (conda create --name ppt python=3.9), activate it (conda activate ppt), and install dependencies (pip install -r requirements/requirements.txt), or create the environment from the provided file (conda env create -f requirements/ppt.yaml).
  • Prerequisites: CUDA 11.8, Python 3.9. Git LFS is required for downloading model weights.
  • Inference: Launch the Gradio interface with python app.py --share. For PowerPaint-V2, use python app.py --share --version ppt-v2 --checkpoint_dir checkpoints/ppt-v2. Model weights can be downloaded from Hugging Face.
  • Resources: Requires significant GPU resources for inference and training.
  • Links: Online Demo, Model Weights
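
The installation and launch steps above can be collected into one shell session. This is a sketch rather than a verified transcript: the clone URL is an assumption inferred from the project name and owner, and the Hugging Face location of the weights is not given here, so that step is left as a comment.

```shell
# Assumed clone URL (inferred from "PowerPaint by open-mmlab"; not stated above).
git clone https://github.com/open-mmlab/PowerPaint.git
cd PowerPaint

# Option A: create the environment by hand, then install dependencies with pip.
conda create --name ppt python=3.9
conda activate ppt
pip install -r requirements/requirements.txt

# Option B (instead of A): build the environment from the provided YAML file,
# then activate the environment it creates.
# conda env create -f requirements/ppt.yaml

# Git LFS is required for downloading the model weights from Hugging Face.
git lfs install
# Download the weights at this point. For PowerPaint-V2, the launch command
# below expects them under checkpoints/ppt-v2.

# Launch the Gradio interface with the default model:
python app.py --share

# Or, for PowerPaint-V2, run this instead:
# python app.py --share --version ppt-v2 --checkpoint_dir checkpoints/ppt-v2
```

Options A and B are alternatives, as are the two launch commands; the commented-out lines show the second choice in each case.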

Highlighted Details

  • Supports text-guided object inpainting, object removal, shape-guided object insertion, and outpainting.
  • Compatible with ControlNet for generating objects with control images (e.g., Canny, Depth, HED, Pose).
  • Offers a "fitting degree" parameter for shape-guided inpainting to control object adherence to mask shapes.
  • PowerPaint-V2, built on BrushNet with RealisticVision, offers higher visual quality.

Maintenance & Community

The project is associated with OpenMMLab. Its most recent listed updates date from May 2024, and contact information for key contributors is provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Training PowerPaint-V1 requires a large batch size (e.g., 1024), while V2 is more memory-efficient. The README notes that potential logic errors in ControlNet loading have been fixed, which suggests past stability issues.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 30 days
