ReinFlow: Flow matching policy fine-tuning via online RL
Summary
ReinFlow offers a flexible framework for fine-tuning flow matching policies using online reinforcement learning, specifically supporting Vision-Language-Action (VLA) models. It enables researchers and engineers to enhance pre-trained imitation learning policies with RL, improving performance on complex robotic tasks. The core benefit is efficient adaptation of flow-based models to downstream RL objectives.
How It Works
The key innovation is an end-to-end trained noise injection network, which makes policy probabilities tractable even with very few denoising steps (as few as 1-4). ReinFlow first trains policies via imitation learning (behavior cloning), then fine-tunes them with online RL. The approach is robust to the discretization and Monte Carlo approximation errors inherent in few-step diffusion processes.
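The mechanism can be sketched in a toy NumPy example (an illustration only, not the actual ReinFlow implementation; `velocity` and `noise_std` are hypothetical stand-ins for the pre-trained flow-matching velocity network and the learned noise-injection network). Injecting Gaussian noise at each Euler denoising step turns the deterministic flow into a discrete-time Markov process whose per-step transitions are Gaussian, so the trajectory log-probability that policy-gradient RL needs is tractable:

```python
import numpy as np

rng = np.random.default_rng(0)

def velocity(x, t):
    # stand-in for the pre-trained flow-matching velocity network
    return -x * (1.0 - t)

def noise_std(x, t):
    # stand-in for the noise-injection network; in ReinFlow this is
    # trained end-to-end during RL fine-tuning
    return 0.1 + 0.05 * t

def sample_action(x0, n_steps=4):
    """Few-step denoising with injected Gaussian noise.

    Each step is a Gaussian transition, so the log-probability of the
    sampled action trajectory can be accumulated in closed form.
    """
    x, logp = x0, 0.0
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        mean = x + velocity(x, t) * dt   # deterministic Euler step
        std = noise_std(x, t)            # learned exploration noise
        x = mean + std * rng.standard_normal(x.shape)
        # accumulate the Gaussian log-density of this transition
        logp += np.sum(-0.5 * ((x - mean) / std) ** 2
                       - np.log(std) - 0.5 * np.log(2 * np.pi))
    return x, logp

action, logp = sample_action(rng.standard_normal(2))
```

The returned `logp` is what an online RL objective (e.g. a policy-gradient loss) would differentiate through with respect to the noise network's parameters.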
Quick Start & Requirements
Installation is detailed in installation/reinflow-setup.md, with experiment reproduction guides in ReproduceExps.md and ReproduceFigs.md. Specific dependencies such as CUDA versions aren't listed, but the project's scale (models up to 3B parameters, extensive robotics benchmarks) implies significant computational resources, likely high-end GPUs. A project website and an arXiv paper (arXiv:2510.25889) are available.
Maintenance & Community
Authored by Tonghe Zhang et al., with contributions from the RLinf project. Code, checkpoints, and documentation are fully released. Direct community channels (e.g., Discord, Slack) are not specified.
Licensing & Compatibility
Released under the permissive MIT license, allowing broad compatibility for commercial use and integration into closed-source projects.
Limitations & Caveats
ReinFlow is explicitly designed for RL fine-tuning of flow matching policies that have already been pre-trained via imitation learning; it is not intended for training policies from scratch and may not be suitable for initial pre-training.