MixGRPO by Tencent-Hunyuan

Enhancing generative model efficiency with mixed ODE-SDE

Created 3 months ago
1,030 stars

Top 36.4% on SourcePulse

Summary

MixGRPO improves the efficiency of flow-based Group Relative Policy Optimization (GRPO) using a novel approach that mixes Ordinary Differential Equations (ODEs) and Stochastic Differential Equations (SDEs). It targets researchers and practitioners in generative AI and aims to improve performance, particularly in text-to-image generation tasks.

How It Works

The project uses a hybrid ODE-SDE formulation to optimize flow-based GRPO. This strategy combines deterministic ODE dynamics with stochastic SDE dynamics, with the aim of making policy optimization in generative models faster and more effective. The specific architecture and data flow are detailed in the associated paper.
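
As a rough illustration of mixing the two kinds of update, the sketch below runs a toy one-dimensional sampler that takes a deterministic Euler (ODE) step at every iteration and adds Gaussian noise only on a chosen set of steps. Everything in it is an assumption made for illustration: the names velocity, mixed_sample, sde_steps, and sigma are hypothetical, the velocity field stands in for a learned model, and the noisy step omits the drift correction that a faithful flow SDE would include. The project's actual formulation is the one described in the paper.

```python
import math
import random


def velocity(x, t):
    # Hypothetical toy velocity field that pulls x toward 1.0.
    # It stands in for a learned flow model and ignores t.
    return 1.0 - x


def mixed_sample(x0, num_steps, sde_steps, sigma=0.5, seed=0):
    """Integrate from t=0 to t=1, adding noise only on the steps in sde_steps."""
    rng = random.Random(seed)
    dt = 1.0 / num_steps
    x = x0
    stochastic_used = []
    for i in range(num_steps):
        t = i * dt
        # Deterministic Euler update, applied on every step.
        x = x + velocity(x, t) * dt
        if i in sde_steps:
            # Simplified stochastic step: Euler update plus scaled Gaussian noise.
            x += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            stochastic_used.append(i)
    return x, stochastic_used
```

Calling mixed_sample(0.0, 10, set()) gives a fully deterministic trajectory, while mixed_sample(0.0, 10, {2, 3, 4}) injects randomness only on steps 2 to 4, so only those steps would contribute stochastic exploration in a policy-optimization setting.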

Quick Start & Requirements

  • Installation: Create a Python 3.12 Conda environment (conda create -n MixGRPO python=3.12). System dependencies include pdsh, pssh, and mesa-libGL (CentOS); setup also involves the env_setup.sh script.
  • Prerequisites: Hugging Face CLI (login required), Weights & Biases (WandB) key. Requires downloading the FLUX.1-dev model and the HPS-v2.1, ImageReward, Pick Score, and CLIP Score reward models.
  • Hardware: Training supports multi-node setups (e.g., 4 nodes, 32 GPUs) using pdsh and torchrun. Inference/evaluation use single-node scripts.
  • Links: Paper: https://arxiv.org/abs/2507.21802. FLUX Model: black-forest-labs/FLUX.1-dev. HPSv2 Code: https://github.com/tgxs002/HPSv2.git. MixGRPO Weights: tulvgengenr/MixGRPO.
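
Before starting, it can help to confirm that the listed tools are present. The following is a minimal sketch of such a check, not part of the repository: the function name missing_prerequisites is hypothetical, the command names are assumed to match the tools listed above (they can differ by distribution), and WANDB_API_KEY is the environment variable the wandb client reads, which may not be how this project expects the key to be supplied.

```python
import os
import shutil

# Assumed command names for the tools listed above; adjust for your system.
REQUIRED_TOOLS = ["conda", "pdsh", "pssh", "huggingface-cli"]
# Environment variable read by the wandb client for its API key.
REQUIRED_ENV = ["WANDB_API_KEY"]


def missing_prerequisites(tools=REQUIRED_TOOLS, env_vars=REQUIRED_ENV,
                          which=shutil.which, environ=os.environ):
    """Return (missing_tools, missing_env_vars) for the current machine."""
    # A tool counts as missing when it cannot be found on PATH.
    missing_tools = [tool for tool in tools if which(tool) is None]
    # A variable counts as missing when it is unset or empty.
    missing_env = [var for var in env_vars if not environ.get(var)]
    return missing_tools, missing_env
```

Running missing_prerequisites() with no arguments inspects the current PATH and environment; the which and environ parameters exist only so the check can be exercised without touching the real system.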

Highlighted Details

  • Novel mixed ODE-SDE approach for flow-based GRPO efficiency.
  • Supports multi-reward fine-tuning (HPSv2, ImageReward, Pick Score) on FLUX.1-dev.
  • Provides scripts for data preprocessing, multi-node training, inference, and evaluation.
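
To make the multi-reward idea concrete, here is a small sketch of how several reward scores could feed a GRPO-style update. Standardizing rewards within a group of samples is the general GRPO recipe; the weighted-average combination rule, the function names, and the example weights are assumptions for illustration, and the repository may combine its reward models differently.

```python
import statistics


def combine_rewards(scores, weights):
    """Weighted average of per-model scores for one sample (assumed rule)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when all rewards in the group are equal.
    return [(r - mean) / (std + eps) for r in rewards]
```

In practice the raw scores from different reward models sit on different scales, so some per-model normalization before combining them is likely needed; that step is left out here.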

Maintenance & Community

No specific maintenance details, community channels, or roadmap links are provided in the README. The project is associated with Tencent Hunyuan.

Licensing & Compatibility

  • License: "License Terms of MixGRPO". The specific terms are in ./License.txt.
  • Compatibility: No explicit notes on commercial use or closed-source linking are present; license terms will govern.

Limitations & Caveats

  • Project is under active development; TODOs include updating technical reports and FlowGRPO comparisons.
  • License terms require consulting License.txt and may impose restrictions.
  • Multi-node training setup is complex, requiring specific environment variables and cluster tools.
  • Model downloads necessitate huggingface-cli login.
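
To give a sense of what the multi-node setup involves, the sketch below builds a torchrun command line for each node. The flags are standard torchrun options, but the function name, the script name train.py, the master address node0, the port, and the even split of 8 GPUs per node (from the 4-node, 32-GPU example) are all assumptions; the repository's own launch scripts define the real values.

```python
def torchrun_command(node_rank, num_nodes=4, gpus_per_node=8,
                     master_addr="node0", master_port=29500,
                     script="train.py"):
    """Build the argv list one node would run (hypothetical values)."""
    return [
        "torchrun",
        f"--nnodes={num_nodes}",
        f"--nproc_per_node={gpus_per_node}",
        f"--node_rank={node_rank}",      # differs on every node
        f"--master_addr={master_addr}",  # same rendezvous host on every node
        f"--master_port={master_port}",
        script,
    ]


# One command per node; a tool such as pdsh would run each on its host.
commands = [torchrun_command(rank) for rank in range(4)]
```

Only node_rank changes between nodes, which is why the rank and the rendezvous address typically end up as per-node environment variables.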

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 4

Star History

162 stars in the last 30 days

Explore Similar Projects

Starred by Lilian Weng (Cofounder of Thinking Machines Lab), Patrick Kidger (Core Contributor to JAX ecosystem), and 12 more.

glow by openai

0.0%
3k
Generative flow research paper code
Created 7 years ago
Updated 1 year ago