Awesome-Image-Editing by FudanCVL

Survey of multimodal-guided image editing with diffusion models

Created 1 year ago

465 stars

Top 65.2% on SourcePulse

Project Summary

This repository is a curated survey of multimodal-guided image editing methods built on text-to-image diffusion models. It gives researchers and practitioners a categorized overview of recent methods as a starting point for exploring and developing new editing approaches. Its main benefit is a structured reference for the landscape of diffusion-based image manipulation.

How It Works

The project tracks recent research papers on multimodal-guided image editing and organizes them by editing task (e.g., object manipulation, style change, inpainting). For each method it records the underlying inversion and editing algorithms and the guidance sets used, which makes it easier to see how the techniques differ and how they are combined.

Quick Start & Requirements

This repository is a curated survey of research papers and does not provide a runnable software package. Therefore, there are no installation or setup requirements.

Highlighted Details

Comprehensive cataloging of recent multimodal-guided image editing methods using text-to-image diffusion models.
Detailed breakdown of editing tasks, including content-aware (object manipulation, spatial transformation, inpainting, style change, image translation) and content-free (subject-driven, attribute-driven) customization.
Classification of methods based on inversion algorithms (Tuning-Based, Forward-Based) and editing algorithms (Normal, Attention-Based, Blending-Based, Score-Based, Optimization-Based).
Links to associated code repositories and project pages for many surveyed methods.
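
The classification above can be pictured as a small data model. The sketch below is illustrative only: the repository ships no code, so the names Inversion, Editing, Entry, and filter_entries are hypothetical, and the records are placeholders rather than real surveyed papers. Only the category labels are taken from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class Inversion(Enum):
    # Inversion algorithm families named in the survey's classification.
    TUNING_BASED = "Tuning-Based"
    FORWARD_BASED = "Forward-Based"


class Editing(Enum):
    # Editing algorithm families named in the survey's classification.
    NORMAL = "Normal"
    ATTENTION_BASED = "Attention-Based"
    BLENDING_BASED = "Blending-Based"
    SCORE_BASED = "Score-Based"
    OPTIMIZATION_BASED = "Optimization-Based"


@dataclass(frozen=True)
class Entry:
    title: str            # placeholder title, not a real surveyed paper
    task: str             # editing task, e.g. "inpainting" or "style change"
    inversion: Inversion  # primary inversion algorithm
    editing: Editing      # primary editing algorithm


def filter_entries(entries, *, task=None, inversion=None, editing=None):
    """Return the entries that match every criterion that is not None."""
    return [
        e for e in entries
        if (task is None or e.task == task)
        and (inversion is None or e.inversion == inversion)
        and (editing is None or e.editing == editing)
    ]


# Placeholder records for illustration only; they do not describe real papers.
ENTRIES = [
    Entry("placeholder-a", "inpainting", Inversion.FORWARD_BASED, Editing.BLENDING_BASED),
    Entry("placeholder-b", "style change", Inversion.TUNING_BASED, Editing.NORMAL),
    Entry("placeholder-c", "inpainting", Inversion.FORWARD_BASED, Editing.ATTENTION_BASED),
]

matches = filter_entries(ENTRIES, task="inpainting", inversion=Inversion.FORWARD_BASED)
print([e.title for e in matches])
```

Each entry stores only one inversion and one editing label, which mirrors the survey's choice to file a method under its primary technique even when a study combines several algorithms.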

Maintenance & Community

The repository is maintained by FudanCVL; the contact email is henghui.ding[AT]gmail.com. Pull requests that add missing works or offer suggestions are welcome.

Licensing & Compatibility

No specific license information is provided in the README.

Limitations & Caveats

As a survey, the repository reflects the state of research at the time of its last update and does not offer a unified, executable framework. Each method is filed under the primary technique it uses, although many studies employ multiple algorithms simultaneously.

Health Check

Last Commit

6 months ago

Responsiveness

Inactive

Star History

4 stars in the last 30 days