MagicQuill by ant-research

Interactive image editing system for precise manipulation (CVPR 2025 paper)

created 8 months ago
3,513 stars

Top 14.0% on sourcepulse

Project Summary

MagicQuill is an intelligent, interactive image editing system designed for precise local edits, offering AI-powered suggestions and a user-friendly interface. It targets researchers and users seeking advanced image manipulation capabilities, enabling detailed control over edits through intuitive brush tools.

How It Works

MagicQuill combines diffusion models with interactive brush strokes for image editing. Users can "add" elements, "subtract" unwanted parts, or precisely "color" regions. A "Draw and Guess" feature predicts user intent from the brush strokes and fills in the prompt automatically. This gives fine-grained, local control over image generation and modification rather than only global adjustments.
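To make the idea of brush-driven local edits concrete, the sketch below shows one way a stroke could be turned into a pixel mask marking the region to edit. It is purely illustrative: BrushType, Stroke, and stroke_mask are hypothetical names invented for this example, and nothing here is taken from MagicQuill's actual code or reflects how its diffusion pipeline consumes strokes.

```python
from dataclasses import dataclass
from enum import Enum


class BrushType(Enum):
    # The three editing modes named in the summary (hypothetical enum).
    ADD = "add"
    SUBTRACT = "subtract"
    COLOR = "color"


@dataclass
class Stroke:
    brush: BrushType
    points: list[tuple[int, int]]  # (x, y) pixel positions along the stroke
    radius: int                    # brush radius in pixels


def stroke_mask(stroke: Stroke, width: int, height: int) -> list[list[int]]:
    """Return a height x width grid of 0/1 values.

    A pixel is 1 if it lies within `radius` of any point on the stroke.
    """
    mask = [[0] * width for _ in range(height)]
    r = stroke.radius
    for cx, cy in stroke.points:
        # Only visit the bounding box of the brush disc, clipped to the image.
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    mask[y][x] = 1
    return mask


if __name__ == "__main__":
    stroke = Stroke(BrushType.ADD, points=[(2, 2)], radius=1)
    for row in stroke_mask(stroke, width=5, height=5):
        print("".join(str(v) for v in row))
```

In a real editor the mask, the brush type, and a text prompt would presumably be passed together to the model; that hand-off is outside the scope of this sketch.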

Quick Start & Requirements

  • Install: Clone the repository with git clone --recursive https://github.com/magic-quill/MagicQuill.git, then run the setup script for your platform (windows_setup.bat or linux_setup.sh) or follow the manual installation steps.
  • Prerequisites: Python 3.10, PyTorch 2.1.2 with CUDA 11.8 support, and approximately 25 GB of disk space for checkpoints. A GPU with at least 8 GB of VRAM is required.
  • Resources: Downloading the checkpoints can be time-consuming.
  • Links: Demo Page, ComfyUI Node, Modelscope.
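The prerequisites above can be checked before starting the large checkpoint download. The following is a minimal sketch, not part of the MagicQuill repository: the function names and constants are made up for this example, and the thresholds simply restate the figures listed above (Python 3.10, about 25 GB of disk, 8 GB of VRAM). The VRAM check uses PyTorch only if it is already installed.

```python
import shutil
import sys

# Thresholds restated from the prerequisites list (assumed, not read from the repo).
REQUIRED_PYTHON = (3, 10)
CHECKPOINT_GB = 25
MIN_VRAM_GB = 8


def python_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter's major.minor version is exactly 3.10."""
    return tuple(version_info[:2]) == REQUIRED_PYTHON


def disk_ok(path: str = ".", required_gb: float = CHECKPOINT_GB) -> bool:
    """True if the filesystem holding `path` has at least `required_gb` GiB free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb


def vram_ok(min_gb: float = MIN_VRAM_GB):
    """True/False if a CUDA GPU can be inspected, None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return False
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Decimal gigabytes, so a card marketed as "8 GB" is not rejected
    # because the driver reports slightly less than 8 GiB.
    return total_bytes / 1e9 >= min_gb


if __name__ == "__main__":
    print("Python 3.10:", python_ok())
    print("Free disk >= 25 GB:", disk_ok())
    print("GPU VRAM >= 8 GB:", vram_ok())
```

Running the script from the directory where the repository will be cloned checks the disk that will actually hold the checkpoints.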

Highlighted Details

  • Accepted to CVPR 2025.
  • Supports multiple editing modes: add, subtract, and color brushes.
  • Features "Draw and Guess" for prompt auto-completion.
  • Offers tunable parameters for brush size, edge control, and generation strength.
  • Available as a Docker container for isolated environments.

Maintenance & Community

  • Actively developed; recent updates (Nov-Dec 2024) include UI enhancements and the release of a ComfyUI node.
  • The README credits contributions from users including lior007, JamesIV4, and Furkan Gözükara.
  • No explicit community links (Discord/Slack) are provided in the README.

Licensing & Compatibility

  • Licensed under CC BY-NC 4.0.
  • Restrictions: Non-commercial use only; generating harmful content is prohibited.

Limitations & Caveats

The CC BY-NC 4.0 license rules out commercial use. Running the system requires a GPU with at least 8 GB of VRAM and roughly 25 GB of checkpoint downloads, and the "Draw and Guess" feature may sometimes misinterpret user intent.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 5
  • Star history: 203 stars in the last 90 days
