Unified mask-guided image generation and editing
Summary
OneReward introduces a novel Reinforcement Learning from Human Feedback (RLHF) methodology for visual tasks, leveraging Qwen2.5-VL as a generative reward model. This approach enhances multi-task reinforcement learning policies, enabling the Seedream 3.0 Fill model to achieve state-of-the-art performance in unified image editing. It offers significant improvements across diverse tasks like image fill, extend, object removal, and text rendering, surpassing leading commercial and open-source alternatives. The project targets researchers and developers seeking advanced image manipulation capabilities.
How It Works
The core innovation lies in applying RLHF to the visual domain using Qwen2.5-VL for reward modeling. This enhances a policy model's generation ability across multiple subtasks. The resulting Seedream 3.0 Fill model is a unified image editing system designed for versatility. Key architectural choices involve integrating a transformer-based reward model with diffusion pipelines for fine-grained control.
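The idea of one reward signal shared across several editing subtasks can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the criterion names, the weights in TASK_WEIGHTS, and the functions toy_reward, centered_rewards, and best_candidate are hypothetical. A weighted table stands in for the Qwen2.5-VL judge, and no policy training is performed; the sketch only shows how a single task-conditioned scorer could rank candidates and produce a mean-centered signal of the kind a policy update might consume.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical per-task criterion weights. The real reward model is a
# vision-language model, not a lookup table; this is only a stand-in.
TASK_WEIGHTS: Dict[str, Dict[str, float]] = {
    "fill": {"aesthetics": 0.5, "consistency": 0.5},
    "extend": {"aesthetics": 0.25, "consistency": 0.75},
    "object_removal": {"aesthetics": 0.25, "clean_removal": 0.75},
    "text_rendering": {"aesthetics": 0.25, "text_accuracy": 0.75},
}


@dataclass
class Candidate:
    name: str                 # label for one generated image
    task: str                 # which subtask produced it
    scores: Dict[str, float]  # hypothetical criterion scores in [0, 1]


def toy_reward(candidate: Candidate) -> float:
    """Score a candidate with one function that is conditioned on the task."""
    weights = TASK_WEIGHTS[candidate.task]
    return sum(w * candidate.scores.get(k, 0.0) for k, w in weights.items())


def centered_rewards(candidates: List[Candidate]) -> Dict[str, float]:
    """Subtract the group mean so above-average candidates get a positive signal."""
    rewards = {c.name: toy_reward(c) for c in candidates}
    mean = sum(rewards.values()) / len(rewards)
    return {name: reward - mean for name, reward in rewards.items()}


def best_candidate(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate the shared reward function prefers."""
    return max(candidates, key=toy_reward)


if __name__ == "__main__":
    group = [
        Candidate("a", "object_removal", {"aesthetics": 1.0, "clean_removal": 0.0}),
        Candidate("b", "object_removal", {"aesthetics": 0.0, "clean_removal": 1.0}),
    ]
    print(best_candidate(group).name, centered_rewards(group))
```

The same toy_reward function serves every task; only the task label changes which criteria matter, which is the property the summary attributes to the unified reward model.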
Quick Start & Requirements
Installation requires transformers>=4.51.3 and diffusers>=0.35.0; the latter can be upgraded with pip install -U diffusers. GPU acceleration is recommended, as indicated by torch_dtype=torch.bfloat16 and .to("cuda") in the examples. Two inference demos (demo_one_reward.py and demo_one_reward_dynamic.py) are provided.
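Before running the demos, the stated minimum versions can be checked with a small standard-library script. This is a minimal sketch, not part of the project: the helper names version_tuple, meets_minimum, and check_requirements are hypothetical, and the comparison looks only at the leading numeric release components, so a pre-release such as 0.35.0.dev0 is treated as 0.35.0.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Tuple

# Minimum versions as stated in the requirements above.
MINIMUMS: Dict[str, str] = {"transformers": "4.51.3", "diffusers": "0.35.0"}


def version_tuple(text: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '0.35.0.dev0' -> (0, 35, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release components only; pre-release suffixes are ignored."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums: Dict[str, str] = MINIMUMS) -> Dict[str, str]:
    """Map each package name to 'ok', 'missing', or a 'too old' message."""
    report: Dict[str, str] = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

Any package reported as missing or too old can then be installed or upgraded with pip before launching the demos.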
Highlighted Details
The FLUX.1-Fill-dev-OneReward checkpoint surpasses the closed-source FLUX Fill [Pro] in inpainting and outpainting.
Model variants named: FLUX.1-Fill-dev[OneReward] and FLUX.1-Fill-dev[OneRewardDynamic].
Maintenance & Community
The project roadmap indicates ongoing development with planned releases for text-to-image checkpoints and ComfyUI support. No specific community channels (Discord, Slack) or contributor/sponsorship information are detailed in the provided README.
Licensing & Compatibility
The codebase is licensed under Apache 2.0. The models, however, are released under CC BY-NC 4.0, which prohibits commercial use and requires attribution. This restricts integration of the models into proprietary or commercial applications.
Limitations & Caveats
The base model shows limited improvement for object removal tasks; a separate LoRA is necessary for better performance. Certain planned features, such as text-to-image checkpoints and ComfyUI support, are still pending release. The associated arXiv paper is also marked for future release.