OneReward by bytedance

Unified mask-guided image generation and editing

Created 6 months ago

335 stars

Top 82.4% on SourcePulse

Project Summary

Summary

OneReward introduces a novel Reinforcement Learning from Human Feedback (RLHF) methodology for visual tasks, leveraging Qwen2.5-VL as a generative reward model. This approach enhances multi-task reinforcement learning policies, enabling the Seedream 3.0 Fill model to achieve state-of-the-art performance in unified image editing. It offers significant improvements across diverse tasks like image fill, extend, object removal, and text rendering, surpassing leading commercial and open-source alternatives. The project targets researchers and developers seeking advanced image manipulation capabilities.

How It Works

The core innovation lies in applying RLHF to the visual domain using Qwen2.5-VL for reward modeling. This enhances a policy model's generation ability across multiple subtasks. The resulting Seedream 3.0 Fill model is a unified image editing system designed for versatility. Key architectural choices involve integrating a transformer-based reward model with diffusion pipelines for fine-grained control.
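As a rough illustration of how a generative reward model can yield a scalar reward, the sketch below turns a judge model's "Yes"/"No" answer logits into a preference score for one task and one evaluation criterion. This summary does not describe the actual procedure, so everything here is an assumption made for illustration: the prompt wording, the criterion names, the pairwise setup, and the helper names build_judge_prompt and preference_reward are all hypothetical, and no real Qwen2.5-VL call is made.

```python
import math


def build_judge_prompt(task: str, criterion: str) -> str:
    """Compose a text query for a vision-language judge (hypothetical wording)."""
    return (
        f"Task: {task}. Criterion: {criterion}. "
        "Is the first image better than the second? Answer Yes or No."
    )


def preference_reward(logit_yes: float, logit_no: float) -> float:
    """Map the judge's two answer-token logits to a reward in [0, 1].

    This is a numerically stable two-way softmax, which equals
    sigmoid(logit_yes - logit_no). Reading the reward from answer-token
    logits is an assumption, not something stated in the summary.
    """
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


if __name__ == "__main__":
    # The same scoring function is reused for different subtasks by changing
    # only the text of the prompt (task and criterion names are made up).
    for task, criterion in [("image fill", "structure"), ("object removal", "cleanliness")]:
        print(build_judge_prompt(task, criterion))
    # Placeholder logits standing in for a real judge model's output.
    print(preference_reward(logit_yes=2.0, logit_no=0.5))
```

The point of the sketch is only that one judge, conditioned on a task and a criterion through its prompt, can serve several subtasks; how such a reward is fed into policy training is not covered here.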

Quick Start & Requirements

The project requires transformers>=4.51.3 and diffusers>=0.35.0; diffusers can be upgraded with pip install -U diffusers. The examples load weights with torch_dtype=torch.bfloat16 and move the pipeline with .to("cuda"), so a CUDA-capable GPU is expected. Two inference demos are provided: demo_one_reward.py and demo_one_reward_dynamic.py.
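Before running the demos, it can help to confirm that the installed packages meet the minimum versions stated above. The following is a minimal sketch using only the Python standard library; the helper names (parse_release, meets_minimum, check_environment) are hypothetical and not part of the project, and pre-release suffixes such as .dev0 are simply ignored when comparing.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the requirements listed above.
MINIMUMS = {"transformers": "4.51.3", "diffusers": "0.35.0"}


def parse_release(v: str) -> tuple:
    """Extract the leading numeric release components, e.g. '0.35.0.dev0' -> (0, 35, 0)."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at the first non-numeric component (e.g. 'dev0')
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, required: str) -> bool:
    """Compare numerically, so that '4.9.0' is correctly older than '4.51.3'."""
    a, b = parse_release(installed), parse_release(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS) -> dict:
    """Return a status string per package: 'ok', 'missing', or 'too old (...)'."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, required):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {required})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Any package reported as missing or too old can then be installed or upgraded with pip before the demos are run.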

Highlighted Details

Seedream 3.0 Fill outperforms commercial systems such as Ideogram, Adobe Photoshop, and FLUX Fill [Pro].
The released FLUX.1-Fill-dev-OneReward checkpoint surpasses closed-source FLUX Fill [Pro] in inpainting and outpainting.
Offers specialized checkpoints: FLUX.1-Fill-dev[OneReward] and FLUX.1-Fill-dev[OneRewardDynamic].
A dedicated LoRA adapter is available to significantly improve object removal capabilities.

Maintenance & Community

The project roadmap lists planned releases of text-to-image checkpoints and ComfyUI support. The README does not mention community channels (Discord, Slack), contributors, or sponsorship.

Licensing & Compatibility

The codebase is licensed under Apache 2.0. The models, however, are released under CC BY-NC 4.0, which requires attribution and prohibits commercial use. This restricts integration of the models into proprietary or commercial applications.

Limitations & Caveats

The base model shows limited improvement for object removal tasks; a separate LoRA is necessary for better performance. Certain planned features, such as text-to-image checkpoints and ComfyUI support, are still pending release. The associated arXiv paper is also marked for future release.

Health Check

Last Commit

5 months ago

Responsiveness

Inactive

Star History

7 stars in the last 30 days