PowerPaint by open-mmlab

Image inpainting model for versatile image editing tasks

Created 1 year ago
973 stars

Top 37.9% on SourcePulse

Project Summary

PowerPaint is a versatile image inpainting model designed for researchers and practitioners in computer vision and generative AI. It offers a unified solution for text-guided object inpainting, object removal, shape-guided object insertion, and outpainting, all within a single model, simplifying complex image editing workflows.

How It Works

PowerPaint leverages task-specific prompts to guide its inpainting process, enabling it to handle diverse editing tasks with a single architecture. It builds upon the BrushNet framework, preserving cross-attention layers for prompt integration, which allows for fine-grained control over the inpainting results, particularly in shape-guided generation.
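As a rough illustration of how task-specific prompts can steer a single model, the sketch below appends a task token to the user's text prompt. The token strings (P_obj, P_ctxt, P_shape), the task names, and the build_task_prompt helper are assumptions made for illustration; this summary does not document PowerPaint's actual prompt format.

```python
# Hypothetical mapping from editing task to a task token.
# The token strings are illustrative assumptions, not a documented API.
TASK_TOKENS = {
    "text-guided": "P_obj",      # generate an object described by the text
    "object-removal": "P_ctxt",  # fill the hole from surrounding context
    "shape-guided": "P_shape",   # make the object follow the mask shape
    "outpainting": "P_ctxt",     # extend the image from surrounding context
}


def build_task_prompt(task: str, user_prompt: str = "") -> str:
    """Return the user prompt with the task token appended."""
    if task not in TASK_TOKENS:
        raise ValueError(f"unknown task: {task!r}")
    # strip() handles the case where the user prompt is empty.
    return f"{user_prompt} {TASK_TOKENS[task]}".strip()


if __name__ == "__main__":
    print(build_task_prompt("text-guided", "a red vintage car"))
    print(build_task_prompt("object-removal"))
```

The point of the design is that the model weights stay the same across tasks; only the prompt changes.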

Quick Start & Requirements

  • Installation: Clone the repository, then set up the environment in one of two ways: create and activate a conda environment (conda create --name ppt python=3.9, then conda activate ppt) and install dependencies with pip install -r requirements/requirements.txt, or create the environment from the provided file with conda env create -f requirements/ppt.yaml.
  • Prerequisites: CUDA 11.8, Python 3.9. Git LFS is required for downloading model weights.
  • Inference: Launch the Gradio interface with python app.py --share. For PowerPaint-V2, use python app.py --share --version ppt-v2 --checkpoint_dir checkpoints/ppt-v2. Model weights can be downloaded from Hugging Face.
  • Resources: Requires significant GPU resources for inference and training.
  • Links: Online Demo, Model Weights
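The launch commands above differ only in the extra flags needed for PowerPaint-V2. A minimal helper sketch that assembles the right command and checks the Python prerequisite might look like this. The function names and the "ppt-v1" label for the default model are assumptions; the flags themselves are the ones quoted in the list.

```python
import shlex
import sys


def python_matches_prerequisite(version_info=sys.version_info) -> bool:
    """Check whether the interpreter is the Python 3.9 listed as a prerequisite."""
    return tuple(version_info[:2]) == (3, 9)


def build_launch_command(version: str = "ppt-v1") -> list:
    """Build the Gradio launch command for the requested model version.

    "ppt-v1" is a hypothetical label for the default launch (no extra flags).
    """
    cmd = ["python", "app.py", "--share"]
    if version == "ppt-v2":
        cmd += ["--version", "ppt-v2", "--checkpoint_dir", "checkpoints/ppt-v2"]
    elif version != "ppt-v1":
        raise ValueError(f"unknown version: {version!r}")
    return cmd


if __name__ == "__main__":
    if not python_matches_prerequisite():
        print("warning: the summary lists Python 3.9 as a prerequisite")
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(build_launch_command("ppt-v2")))
```

The resulting list can be passed to subprocess.run from the repository root once the weights have been downloaded.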

Highlighted Details

  • Supports text-guided object inpainting, object removal, shape-guided object insertion, and outpainting.
  • Compatible with ControlNet for generating objects with control images (e.g., Canny, Depth, HED, Pose).
  • Offers a "fitting degree" parameter for shape-guided inpainting to control object adherence to mask shapes.
  • PowerPaint-V2, built on BrushNet with RealisticVision, offers higher visual quality.
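One plausible reading of the "fitting degree" parameter is a weight that interpolates between a conditioning that follows the mask shape and one that follows the surrounding context. The sketch below shows that interpolation on plain Python lists. The function name, the two embedding inputs, and the linear blend itself are assumptions, not confirmed details of PowerPaint.

```python
def blend_embeddings(shape_emb, context_emb, fitting_degree):
    """Linearly blend two embeddings (assumed mechanism, for illustration only).

    fitting_degree = 1.0 keeps only the shape-following embedding;
    fitting_degree = 0.0 keeps only the context-following embedding.
    """
    if not 0.0 <= fitting_degree <= 1.0:
        raise ValueError("fitting_degree must be in [0, 1]")
    if len(shape_emb) != len(context_emb):
        raise ValueError("embeddings must have the same length")
    return [
        fitting_degree * s + (1.0 - fitting_degree) * c
        for s, c in zip(shape_emb, context_emb)
    ]


if __name__ == "__main__":
    # Toy two-dimensional "embeddings" standing in for real prompt embeddings.
    print(blend_embeddings([1.0, 0.0], [0.0, 1.0], 0.5))
```

Under this reading, a higher value makes the generated object hug the mask outline, while a lower value lets it settle more loosely inside the masked region.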

Maintenance & Community

The project is associated with OpenMMLab. The most recent updates noted in the README date from May 2024. Contact information for key contributors is provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Training PowerPaint-V1 requires a large batch size (e.g., 1024), while V2 is more memory-efficient. The README notes that logical errors in ControlNet loading have been fixed, which suggests earlier releases had stability issues.
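To get a feel for what a batch size of 1024 implies, the arithmetic below works out how many gradient-accumulation steps would be needed to reach it on a given setup. Gradient accumulation is a generic technique and is not stated to be part of PowerPaint's training recipe; the per-device batch sizes and GPU counts are hypothetical.

```python
import math


def accumulation_steps(target_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Number of accumulation steps needed to reach at least target_batch.

    Effective batch = per_device_batch * num_devices * accumulation steps.
    """
    if min(target_batch, per_device_batch, num_devices) <= 0:
        raise ValueError("all arguments must be positive")
    samples_per_step = per_device_batch * num_devices
    return math.ceil(target_batch / samples_per_step)


if __name__ == "__main__":
    # Hypothetical setups: 8 GPUs with 16 samples each, and 1 GPU with 4 samples.
    print(accumulation_steps(1024, 16, 8))
    print(accumulation_steps(1024, 4, 1))
```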

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 20 stars in the last 30 days
