PKU-YuanGroup
Advanced image editing via diffusion negative-aware finetuning and MLLM feedback
Top 98.8% on SourcePulse
Summary
Edit-R1 enhances image editing by fine-tuning diffusion models with "Diffusion Negative-Aware Finetuning" and implicit feedback from Multimodal Large Language Models (MLLMs). This approach aims for precise control and higher quality AI-driven image manipulation, targeting researchers and developers.
How It Works
Leveraging the DiffusionNFT codebase, Edit-R1 fine-tunes diffusion models using a training-free reward model derived from pretrained MLLMs. This negative-aware feedback mechanism guides the fine-tuning process, improving the model's understanding of undesirable edits for more robust and accurate results.
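The core idea, weighting a diffusion training loss by MLLM-derived rewards so that below-average ("negative") edits actively push the model away from them, can be sketched as follows. This is a minimal illustration, not the actual DiffusionNFT objective; the function name and the tanh-based weighting are assumptions for clarity.

```python
import torch

def negative_aware_loss(per_sample_loss: torch.Tensor,
                        rewards: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of reward-weighted finetuning.

    Each sample's diffusion loss is scaled by a centered reward:
    above-average edits get positive weight (reinforced), while
    below-average edits get negative weight (suppressed).
    """
    advantages = rewards - rewards.mean()   # sign marks good vs. bad edits
    weights = torch.tanh(advantages)        # bounded, sign-preserving weights
    return (weights * per_sample_loss).mean()
```

In the real pipeline the rewards would come from the MLLM reward server scoring each candidate edit; here they are just scalars.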
Quick Start & Requirements
Installation involves cloning the repository, creating a Python 3.10.16 Conda environment, and running `pip install -e .`. Training additionally requires deploying a reward server (`python reward_server/reward_server.py`) and setting the `REWARD_SERVER` environment variable to its address. Training data must follow a specific directory structure containing images and metadata JSONL files.
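The steps above can be sketched as a shell session. The repository URL, environment name, and server port are placeholders, not values from the README.

```shell
# Clone and install (repo URL is a placeholder).
git clone <repo-url> && cd Edit-R1
conda create -n edit-r1 python=3.10.16 -y
conda activate edit-r1
pip install -e .

# Training requires the reward server to be running,
# with its address exported in REWARD_SERVER.
python reward_server/reward_server.py &
export REWARD_SERVER=http://127.0.0.1:8000   # port is an assumption
```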
Highlighted Details
The project highlights two released model variants: UniWorld-Qwen-Image-Edit-2509 and UniWorld-FLUX.1-Kontext-Dev.
Maintenance & Community
The project references Arxiv papers and a Hugging Face collection. No direct links to community forums or a public roadmap are provided in the README.
Licensing & Compatibility
The primary license is Apache. However, FLUX weights are under a "FLUX.1 [dev] Non-Commercial License," restricting their use in commercial applications.
Limitations & Caveats
The primary limitation is the non-commercial use restriction for FLUX weights. The README does not detail other potential limitations, unsupported platforms, or known bugs.
Last updated 2 months ago; the repository is marked inactive.