ReinFlow by ReinFlow

Flow matching policy fine-tuning via online RL

Created 9 months ago
257 stars

Top 98.3% on SourcePulse

Project Summary

ReinFlow offers a flexible framework for fine-tuning flow matching policies using online reinforcement learning, specifically supporting Vision-Language-Action (VLA) models. It enables researchers and engineers to enhance pre-trained imitation learning policies with RL, improving performance on complex robotic tasks. The core benefit is efficient adaptation of flow-based models to downstream RL objectives.

How It Works

The key innovation is an end-to-end trained noise injection network that turns the policy's deterministic flow into a stochastic process, yielding tractable action log-probabilities even with very few denoising steps (1-4). ReinFlow first trains policies via imitation learning (behavior cloning) and then fine-tunes them with online RL. The approach is robust to the discretization and Monte Carlo approximation errors inherent in few-step sampling.
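To make the idea concrete, here is a minimal PyTorch sketch of the general technique described above; it is not ReinFlow's actual code or API. The `velocity` and `noise` networks, the MLP sizes, and the `NoiseInjectedFlowStep` name are all hypothetical. One Euler step of the flow ODE is perturbed by a learned noise scale, so each denoising step becomes a Gaussian transition with a closed-form log-probability that an RL objective can use:

```python
import torch
import torch.nn as nn

class NoiseInjectedFlowStep(nn.Module):
    """Illustrative sketch: one denoising step of a flow matching policy
    with learned injected noise, making the step's log-probability tractable."""

    def __init__(self, dim: int):
        super().__init__()
        # Hypothetical small MLPs for the velocity field v(x, t)
        # and the noise scale sigma(x, t) (Softplus keeps it positive).
        self.velocity = nn.Sequential(
            nn.Linear(dim + 1, 64), nn.Tanh(), nn.Linear(64, dim))
        self.noise = nn.Sequential(
            nn.Linear(dim + 1, 64), nn.Tanh(), nn.Linear(64, dim), nn.Softplus())

    def forward(self, x: torch.Tensor, t: torch.Tensor, dt: float):
        inp = torch.cat([x, t.expand(x.shape[0], 1)], dim=-1)
        mean = x + self.velocity(inp) * dt        # Euler step of the flow ODE
        std = self.noise(inp) * dt ** 0.5 + 1e-4  # learned injected noise
        dist = torch.distributions.Normal(mean, std)
        x_next = dist.rsample()                   # stochastic denoising step
        log_prob = dist.log_prob(x_next).sum(-1)  # tractable per-step log-likelihood
        return x_next, log_prob
```

Summing `log_prob` over the 1-4 denoising steps gives a policy log-likelihood for the full action, which is what a policy-gradient method needs during online fine-tuning.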

Quick Start & Requirements

Installation is detailed in installation/reinflow-setup.md, with experiment reproduction guides in ReproduceExps.md and ReproduceFigs.md. Specific dependencies such as CUDA versions are not listed, but the project's scale (models up to 3B parameters, extensive robotics benchmarks) implies significant compute, likely high-end GPUs. A project website and arXiv paper (arXiv:2510.25889) are available.

Highlighted Details

  • Supports fine-tuning advanced VLA models like NVIDIA's GR00T, $\pi_0$, and $\pi_{0.5}$.
  • Achieves strong performance on legged locomotion (OpenAI Gym), state-based manipulation (Franka Kitchen), and visual manipulation (Robomimic).
  • End-to-end noise injection network ensures tractability with few denoising steps and robustness to approximation errors.
  • Compatible with 1-Rectified Flow, Shortcut Models, and other ODE-defined policies.
  • Full training metrics available via WandB; Robomimic rendering bugs fixed.
  • Scaled to 3 billion parameters with RLinf project support.

Maintenance & Community

Authored by Tonghe Zhang et al., with contributions from the RLinf project. Code, checkpoints, and documentation are fully released. Direct community channels (e.g., Discord, Slack) are not specified.

Licensing & Compatibility

Released under the permissive MIT license, which permits commercial use and integration into closed-source projects.

Limitations & Caveats

ReinFlow is designed for fine-tuning pre-trained flow matching policies (e.g., from imitation learning) with online RL, not for training policies from scratch; it is not intended as a pre-training method.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 16 stars in the last 30 days

Explore Similar Projects

Starred by Shizhe Diao (author of LMFlow; Research Scientist at NVIDIA) and Alex Chen (cofounder of Nexa AI).

EasyR1 by hiyouga

RL training framework for multi-modality models
Top 0.5% on SourcePulse · 5k stars
Created 1 year ago · Updated 1 day ago