FireRed-Image-Edit by FireRedTeam

State-of-the-art image editing model

Created 5 months ago

1,296 stars

Top 30.1% on SourcePulse

Project Summary

FireRed-Image-Edit is a general-purpose image editing model designed for high-fidelity and consistent edits across various scenarios. It targets researchers and developers seeking advanced open-source image manipulation capabilities, offering leading performance in instruction following, image quality, and text style preservation.

How It Works

The model is built upon an open-source text-to-image foundation (currently Qwen-Image) and employs a novel training paradigm involving Pretraining, Supervised Fine-Tuning (SFT), and Reinforcement Learning (RL). This approach allows for native editing capabilities, ensuring accurate instruction following and visual coherence, and is designed to be backbone-agnostic for potential application to other text-to-image models.

Quick Start & Requirements

Primary install / run command:

pip install git+https://github.com/huggingface/diffusers

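Example usage (the prompt is in Chinese and asks the model to add a line of English text, "2nd Edition", below the word "Python" on the book cover):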

python inference.py \
    --input_image ./examples/edit_example.png \
    --prompt "在书本封面Python的下方，添加一行英文文字2nd Edition" \
    --output_image output_edit.png \
    --seed 43

Non-default prerequisites and dependencies: the diffusers library, installed from source as shown above. No other hardware or software requirements are explicitly listed for basic setup.

Links:
- HuggingFace: https://huggingface.co/FireRedTeam/FireRed-Image-Edit-1.0
- ModelScope: https://modelscope.cn/models/FireRedTeam/FireRed-Image-Edit-1.0
- Technical Report: https://github.com/FireRedTeam/FireRed-Image-Edit/blob/main/assets/FireRed_Image_Edit_1_0_Techinical_Report.pdf
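
The example command above edits one image per call. As a minimal sketch of how it could be scripted over a folder, the helper below builds the same command for each PNG and runs it with `subprocess`. It uses only the four flags shown in the example (`--input_image`, `--prompt`, `--output_image`, `--seed`). The function names, the `_edit.png` output naming, and the folder layout are hypothetical, and the sketch assumes it is run from the repository root, where `inference.py` is located.

```python
import subprocess
import sys
from pathlib import Path


def build_command(input_image, prompt, output_image, seed=43, script="inference.py"):
    """Build the argument list for one inference.py call.

    Only the four flags from the README example are used; any other
    options the script may accept are not assumed here.
    """
    return [
        sys.executable, script,
        "--input_image", str(input_image),
        "--prompt", prompt,
        "--output_image", str(output_image),
        "--seed", str(seed),
    ]


def edit_folder(input_dir, output_dir, prompt, seed=43, dry_run=False):
    """Apply one prompt to every PNG in input_dir (hypothetical layout).

    Returns the list of commands; with dry_run=True nothing is executed.
    """
    output_dir = Path(output_dir)
    commands = []
    for image in sorted(Path(input_dir).glob("*.png")):
        # Hypothetical naming scheme: photo.png -> photo_edit.png
        target = output_dir / f"{image.stem}_edit.png"
        cmd = build_command(image, prompt, target, seed)
        commands.append(cmd)
        if not dry_run:
            # Create the folder ourselves rather than assume the script does.
            output_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands
```

Calling `edit_folder("./examples", "./outputs", "your prompt", dry_run=True)` returns the commands without running them, which is a cheap way to check paths first. Each subprocess call starts the script afresh, so for large batches a loop inside a single Python process would likely be faster; that would depend on the internals of `inference.py`, which are not described here.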

Highlighted Details

- Achieves state-of-the-art results among open-source models on the ImgEdit, GEdit, and REDEdit benchmarks.
- Demonstrates leading performance in prompt following and visual consistency, comparable to closed-source solutions.
- Preserves text style with high fidelity and offers high-quality photo restoration.
- Supports multi-image editing, such as virtual try-on scenarios.

Maintenance & Community

The project has released model weights and a technical report. Future releases are planned for a distilled version and a text-to-image foundation model. No explicit community channels (e.g., Discord, Slack) or active contributor details are provided in the README.

Licensing & Compatibility

License type: Apache 2.0.
Compatibility notes: Apache 2.0 is generally permissive for commercial use and closed-source linking, allowing broad adoption.

Limitations & Caveats

The project is actively under development, with several features listed as "To be released" in the TODO section, including a distilled model, the REDEdit-Bench dataset, and the core FireRed T2I foundation model. The Ethics Statement highlights that the model has not been comprehensively evaluated for all downstream applications and warns against prohibited uses (illegal, defamatory, pornographic, harmful content).

Health Check

- Last commit: 3 months ago
- Responsiveness: Inactive

Star History

43 stars in the last 30 days