ICEdit by River-Zhang

Image editing with LoRA fine-tuning

created 3 months ago
1,856 stars

Top 23.8% on sourcepulse

Project Summary

ICEdit performs instruction-based image editing at a state-of-the-art level while using significantly less training data and far fewer parameters than prior methods. It targets researchers and users who want efficient, high-fidelity image manipulation, and it reports performance comparable or superior to commercial models in identity preservation and instruction following.

How It Works

ICEdit uses an in-context generation approach within a Diffusion Transformer architecture. It is trained on only 0.5% of the data used by prior methods, which makes training far more efficient. The method focuses on precise instruction adherence and identity preservation, and it is reported to outperform models like GPT-4o in both.

Quick Start & Requirements

  • Install: pip install -r requirements.txt and pip install -U huggingface_hub.
  • Prerequisites: Python 3.10, Conda environment recommended. Pretrained weights for Flux.1-fill-dev and ICEdit-normal-LoRA are required.
  • Hardware: 4 GB of VRAM is sufficient for the ComfyUI-nunchaku workflow. Standard inference on a 512x768 image requires 35 GB of VRAM; GPUs with 24 GB can run it with the --enable-model-cpu-offload option.
  • Resources: An official Hugging Face demo, ComfyUI workflows, and the paper are available.
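
The hardware figures above amount to a small decision rule, which the sketch below spells out. It is an illustration only: the function name plan_for_vram, the InferencePlan type, and the workflow labels are hypothetical and do not come from the repository. Only the 4, 24, and 35 GB thresholds and the --enable-model-cpu-offload flag are taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferencePlan:
    """Hypothetical container describing how to run ICEdit on a given GPU."""
    workflow: str                 # illustrative label, not a repository identifier
    extra_flags: tuple[str, ...]  # command-line flags to append, if any


def plan_for_vram(vram_gb: float) -> InferencePlan:
    """Map available GPU memory (in GB) to the options listed in the summary.

    Assumes the quoted thresholds apply to 512x768 inputs, as stated above.
    """
    if vram_gb >= 35:
        # Enough memory for standard inference without offloading.
        return InferencePlan("standard", ())
    if vram_gb >= 24:
        # Standard inference, with model weights offloaded to the CPU.
        return InferencePlan("standard", ("--enable-model-cpu-offload",))
    if vram_gb >= 4:
        # Low-memory path through the ComfyUI-nunchaku workflow.
        return InferencePlan("comfyui-nunchaku", ())
    raise ValueError(f"{vram_gb} GB is below the 4 GB minimum quoted above")


if __name__ == "__main__":
    for gb in (48, 24, 8):
        print(gb, plan_for_vram(gb))
```

GPUs with between 4 and 24 GB fall through to the ComfyUI-nunchaku path here because the summary gives no other option for that range; that choice is an assumption of this sketch.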

Highlighted Details

  • Achieves state-of-the-art instruction-based editing with 0.5% of the training data and 1% of the parameters used by prior methods.
  • Outperforms commercial models like GPT-4o in identity persistence and instruction following.
  • Offers fast inference (approx. 9 seconds per image) and low cost.
  • Supports ComfyUI integration with workflows for both standard LoRA and moe-lora (moe-lora weights temporarily withdrawn).

Maintenance & Community

  • Active development with recent updates and community contributions (e.g., ComfyUI workflows).
  • Hugging Face trending #2 weekly as of May 6, 2025.
  • Chinese tutorial video available.
  • Project page available.

Licensing & Compatibility

  • The README does not explicitly state a license. Its BibTeX entry cites an arXiv paper, which suggests a research focus but is not itself a license. Terms for commercial use or closed-source linking are not specified.

Limitations & Caveats

The model is trained primarily on realistic images, so performance may degrade on non-realistic styles such as anime and on blurry pictures. The object removal success rate is noted as relatively low because of dataset limitations. The original moe-lora weights are temporarily withdrawn due to cooperation issues.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 2
  • Star history: 1,431 stars in the last 90 days
