BrushEdit by TencentARC

AI agent for image inpainting and editing

Created 1 year ago

586 stars

Top 55.4% on SourcePulse

Project Summary

BrushEdit is a unified AI agent for image inpainting and editing, targeting researchers and practitioners in computer vision and generative AI. It offers both automated and interactive editing capabilities, leveraging a pipeline that combines multi-modal large language models (MLLMs) with a dual-branch diffusion inpainting model (BrushNetX) for precise and context-aware image manipulation.

How It Works

BrushEdit employs a four-step pipeline: editing category classification, primary editing object identification, mask and target caption generation, and finally, image inpainting. Steps one through three utilize pre-trained MLLMs and detection models (GroundingDINO, SAM) to interpret user instructions, identify targets, and generate masks and descriptive captions. The core image editing is performed by BrushNetX, an enhanced diffusion model designed for high-fidelity inpainting and background preservation, guided by the generated masks and captions.
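The four steps above can be sketched as a chain of pluggable stages. This is only an illustrative outline, not BrushEdit's actual code: every name here (`EditPlan`, `Stages`, `plan_edit`, `run_edit`) is hypothetical, and each stage is a placeholder callable standing in for the MLLM, GroundingDINO/SAM, and BrushNetX components.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

Mask = List[List[int]]  # 1 = repaint this pixel, 0 = keep it


@dataclass
class EditPlan:
    """Output of steps 1-3 (hypothetical container, not a BrushEdit type)."""
    category: str
    target_object: str
    mask: Mask
    target_caption: str


@dataclass
class Stages:
    """Placeholder callables for the models behind each step."""
    classify: Callable[[str], str]                 # step 1: editing category
    find_object: Callable[[str, str], str]         # step 2: primary editing object
    make_mask: Callable[[Any, str], Mask]          # step 3a: detection + segmentation
    make_caption: Callable[[Any, str, str], str]   # step 3b: target caption
    inpaint: Callable[[Any, Mask, str], Any]       # step 4: diffusion inpainting


def plan_edit(image: Any, instruction: str, stages: Stages) -> EditPlan:
    """Run steps 1-3 and collect what the inpainting step needs."""
    category = stages.classify(instruction)
    target = stages.find_object(instruction, category)
    mask = stages.make_mask(image, target)
    caption = stages.make_caption(image, instruction, category)
    return EditPlan(category, target, mask, caption)


def run_edit(image: Any, instruction: str, stages: Stages) -> Any:
    """Run the full pipeline: plan the edit, then inpaint inside the mask."""
    plan = plan_edit(image, instruction, stages)
    return stages.inpaint(image, plan.mask, plan.target_caption)
```

Keeping the plan separate from the inpainting call mirrors the interactive mode described in the summary: a user could adjust the mask or caption in the plan before step four runs.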

Quick Start & Requirements

Install: Clone the repository and install dependencies using pip install -e . and pip install -r app/requirements.txt.
Prerequisites: CUDA 11.8, PyTorch 2.0.1, Python 3.10.6. Requires downloading pre-trained checkpoints for BrushNetX, Stable Diffusion base models (e.g., RealisticVisionV60B1), GroundingDINO, SAM, and VLM models (e.g., Qwen2-VL-7B-Instruct).
Setup: Clone the repository, set up the environment, and download the checkpoints. Neither the setup time nor the total checkpoint size is specified.
Demo: Run with sh app/run_app.sh.
Links: Project Page, Arxiv, Video, Hugging Face Demo, Hugging Face Model.
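Before downloading checkpoints, it may help to confirm that the environment matches the listed prerequisites. The sketch below is an assumption-laden helper, not part of the repository: the function names are hypothetical, and it assumes that matching the major.minor release is close enough (the listing does not say whether the versions are strict pins). It relies only on `torch.__version__` and `torch.version.cuda`, and tolerates PyTorch being absent.

```python
import platform
from typing import Dict, List, Optional

# Versions taken from the prerequisites line above.
WANTED = {"python": "3.10.6", "torch": "2.0.1", "cuda": "11.8"}


def release(version: str) -> tuple:
    """Reduce '2.0.1+cu118' to (2, 0): drop the local tag, keep major.minor."""
    core = version.split("+")[0]
    return tuple(int(part) for part in core.split(".")[:2])


def version_mismatches(found: Dict[str, Optional[str]]) -> List[str]:
    """Return one message per component whose major.minor differs or is missing."""
    problems = []
    for name, wanted in WANTED.items():
        have = found.get(name)
        if have is None:
            problems.append(f"{name}: not found (wanted {wanted})")
        elif release(have) != release(wanted):
            problems.append(f"{name}: found {have}, wanted {wanted}")
    return problems


def detect_versions() -> Dict[str, Optional[str]]:
    """Collect installed versions; torch and CUDA stay None if torch is missing."""
    found: Dict[str, Optional[str]] = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
        found["torch"] = str(torch.__version__)
        found["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    return found


if __name__ == "__main__":
    for line in version_mismatches(detect_versions()) or ["environment matches"]:
        print(line)
```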

Highlighted Details

Supports interactive mask manipulation (generation, square/circle, invert, dilate/erode, move).
Offers automated target prompt generation and manual editing.
Includes blending options for preserving original image details.
Maximum resolution is 1024px to prevent Out-of-Memory errors.
Recommends GPT-4o for reasoning, with Qwen2-VL-7B-Instruct as a secondary option.
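The mask operations and blending listed above can be illustrated on a tiny binary mask. This is a dependency-free sketch of the general techniques (inversion, square-window dilation and erosion, translation, and paste-back blending), not the project's implementation; all function names are hypothetical, and real tools would typically use an image library instead of nested lists.

```python
from typing import List

Mask = List[List[int]]   # 1 = edit region, 0 = preserved region
Image = List[List[int]]  # single-channel image, same size as the mask


def invert(mask: Mask) -> Mask:
    """Swap the edit region and the preserved region."""
    return [[1 - v for v in row] for row in mask]


def dilate(mask: Mask, radius: int = 1) -> Mask:
    """Grow the mask: a pixel is set if any pixel in its square window is set."""
    h, w = len(mask), len(mask[0])
    return [
        [
            int(any(
                mask[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ))
            for x in range(w)
        ]
        for y in range(h)
    ]


def erode(mask: Mask, radius: int = 1) -> Mask:
    """Shrink the mask via duality with dilation.

    Pixels outside the image count as set, so the image border
    does not erode the mask by itself.
    """
    return invert(dilate(invert(mask), radius))


def move(mask: Mask, dx: int, dy: int) -> Mask:
    """Translate the mask; pixels shifted out of the frame are dropped."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if mask[y][x] and 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = 1
    return out


def blend(original: Image, edited: Image, mask: Mask) -> Image:
    """Paste-back blend: keep original pixels wherever the mask is 0."""
    return [
        [e if m else o for o, e, m in zip(o_row, e_row, m_row)]
        for o_row, e_row, m_row in zip(original, edited, mask)
    ]
```

Dilating a mask slightly before inpainting and then blending the result back is a common way to hide seams while leaving the unmasked background untouched.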

Maintenance & Community

The project is associated with Tencent ARC, Peking University, The Chinese University of Hong Kong, and Tsinghua University.
Contact email: liyaowei01@gmail.com.

Licensing & Compatibility

The repository does not specify a license. The README notes that the code is modified from diffusers and BrushNet, each of which carries its own license. Whether commercial use or closed-source linking is permitted is not stated.

Limitations & Caveats

The project is marked as "TPAMI under review," suggesting it may still be in an experimental or pre-publication phase.
Specific licensing details for the BrushEdit code itself are not provided in the README, which could impact commercial adoption.
Maximum resolution is limited to 1024px.
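One way to stay within the resolution limit is to downscale inputs before editing. The helper below is a sketch under two assumptions that the listing does not confirm: that the 1024px cap applies to the longer side, and that snapping dimensions down to a multiple of 8 is desirable (Stable Diffusion's VAE downsamples by a factor of 8). The name `fit_within` is hypothetical.

```python
from typing import Tuple


def fit_within(width: int, height: int,
               max_side: int = 1024, multiple: int = 8) -> Tuple[int, int]:
    """Return a size whose longer side is at most max_side.

    The aspect ratio is approximately preserved, and both sides are
    rounded down to a multiple of `multiple` (never below `multiple`).
    """
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    width = max(multiple, width // multiple * multiple)
    height = max(multiple, height // multiple * multiple)
    return width, height


print(fit_within(2048, 1536))  # (1024, 768)
print(fit_within(800, 600))    # (800, 600), already within the limit
```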

Health Check

Last Commit: 5 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0

Star History

1 star in the last 30 days

Explore Similar Projects

ImgEdit by PKU-YuanGroup

Created 9 months ago

Updated 3 months ago

JarvisEvo by LYL1015

AI agent for synergistic photo editing

Created 3 months ago

Updated 3 days ago

OneReward by bytedance

Unified mask-guided image generation and editing

Created 6 months ago

Updated 5 months ago

Forgedit by witcherofresearch

Text-guided image editor via diffusion model fine-tuning

Created 2 years ago

Updated 1 year ago

Starred by Robin Rombach (Cofounder of Black Forest Labs).

glid-3-xl by Jack000

Latent diffusion model for image generation and editing

Created 3 years ago

Updated 3 years ago

multimodal-garment-designer by aimagelab

AI model for fashion image editing via multimodal prompts

Created 2 years ago

Updated 1 year ago

HiDream-E1 by HiDream-ai

Image editing model for instruction-based manipulation

Created 10 months ago

Updated 7 months ago

ai-image-edit by chunxiuxiamo

Generate and edit images with AI precision

Created 2 months ago

Updated 2 weeks ago

PowerPaint by open-mmlab

Image inpainting model for versatile image editing tasks

Created 2 years ago

Updated 2 months ago

Starred by Andreas Jansson (Cofounder of Replicate), Jiaming Song (Chief Scientist at Luma AI), and 1 more.

CrossAttentionControl by bloc97

Image editing via cross-attention control in Stable Diffusion

Created 3 years ago

Updated 3 years ago

Starred by Ettore Di Giacinto (Author of LocalAI) and Simon Willison (Coauthor of Django).

ml-mgie by apple

Image editing via multimodal LLMs (research paper)

Created 2 years ago

Updated 1 year ago

MagicQuill by ant-research

Interactive image editing system for precise manipulation (CVPR 2025 paper)

Created 1 year ago

Updated 2 months ago
