PKU-YuanGroup (Top 97.0% on SourcePulse)
ImgEdit provides a large-scale, high-quality dataset and a comprehensive benchmark suite for image editing tasks, addressing the need for standardized training and evaluation in this domain. It targets AI researchers and developers working on generative models, offering a unified platform for advancing single-turn and multi-turn image manipulation capabilities. The project aims to facilitate the development of more sophisticated and instruction-adherent image editing models.
How It Works
The ImgEdit dataset is curated through a multi-stage pipeline. This process begins with filtering the Laion-aes dataset based on aesthetic scores, followed by dense and short caption generation using vision-language models like Qwen2.5VL-7B and GPT-4o. Object detection and segmentation are performed using YOLO-world and SAM2, with CLIP filtering applied. Diverse editing prompts are generated via GPT-4o, and task-specific editing pipelines, potentially leveraging ComfyUI and Stable Diffusion, are employed. Data quality is further refined through GPT-4o-based filtering. The ImgEdit-Bench benchmark evaluates models across basic, Understanding-Grounding-Editing (UGE), and multi-turn editing suites, assessing instruction adherence, editing quality, and content memory.
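The two score-based filtering stages described above (aesthetic filtering of source images, then CLIP filtering of detected objects) can be pictured as successive threshold filters. The following is only a minimal sketch under stated assumptions: the `Candidate` record, its field names, and both threshold values are hypothetical illustrations and are not taken from the ImgEdit pipeline, whose curation code is not described in detail here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    image_id: str
    aesthetic_score: float  # hypothetical field: aesthetic score of the source image
    clip_score: float       # hypothetical field: CLIP similarity for a detected object


def filter_candidates(candidates, min_aesthetic=6.0, min_clip=0.25):
    """Keep candidates that pass both thresholds.

    The default thresholds are illustrative assumptions, not values
    used by the ImgEdit authors.
    """
    # Stage 1: drop images whose aesthetic score is too low.
    kept = [c for c in candidates if c.aesthetic_score >= min_aesthetic]
    # Stage 2: drop entries whose CLIP similarity is too low.
    kept = [c for c in kept if c.clip_score >= min_clip]
    return kept


if __name__ == "__main__":
    pool = [
        Candidate("img_a", aesthetic_score=6.5, clip_score=0.31),
        Candidate("img_b", aesthetic_score=4.2, clip_score=0.40),
        Candidate("img_c", aesthetic_score=7.1, clip_score=0.10),
    ]
    print([c.image_id for c in filter_candidates(pool)])
```

In a real pipeline the captioning, detection, segmentation, and GPT-4o-based quality checks would sit between and after these stages; they are omitted here because they depend on external models.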
Quick Start & Requirements
The datasets are available on Hugging Face (sysuyy/ImgEdit, sysuyy/ImgEdit_recap_mask) and can be downloaded via huggingface-cli download. Tar packages may require merging (cat a.tar.split.* > a.tar).
Requires torch, transformers, and datasets for loading data. Specific model checkpoints (e.g., ImgEdit_Judge) require environment setup following Qwen2.5-VL.
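Where a shell is not available, the merge step (cat a.tar.split.* > a.tar) can be reproduced in Python. This is a minimal sketch, assuming the parts share a common prefix and that sorting their names gives the correct order, as it does for the suffixes produced by the standard split tool; the function name `merge_tar_splits` is a hypothetical helper, not part of the ImgEdit repository.

```python
import glob
import shutil


def merge_tar_splits(prefix: str, out_path: str) -> list[str]:
    """Concatenate every file matching `<prefix>*` into `out_path`.

    Parts are joined in sorted name order, mirroring how the shell
    expands `a.tar.split.*`. Returns the list of parts that were merged.
    """
    parts = sorted(p for p in glob.glob(glob.escape(prefix) + "*") if p != out_path)
    if not parts:
        raise FileNotFoundError(f"no split parts match {prefix}*")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large archives are not loaded into memory.
                shutil.copyfileobj(src, out)
    return parts
```

For example, `merge_tar_splits("a.tar.split.", "a.tar")` writes the merged archive next to the parts, after which it can be opened with the standard `tarfile` module.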
Maintenance & Community
Recent news (July 2025) indicates ongoing updates to the ImgEdit-Bench leaderboard with new model integrations. The project has open-sourced related work like UniWorld-V1. No direct community channels (e.g., Discord, Slack) are listed.
Licensing & Compatibility
No explicit license information is provided in the README. This omission requires clarification for adoption decisions, especially concerning commercial use or derivative works.
Limitations & Caveats
The release of data curation pipelines is marked as "WIP" (Work In Progress). The ImgEdit_Judge component requires specific environment setup aligned with Qwen2.5-VL, and its usage involves custom inference code. The absence of a stated license is a significant adoption blocker.