Robo-Dopamine by FlagOpen

General process reward modeling for high-precision robotic manipulation

Created 6 months ago

644 stars

Top 50.9% on SourcePulse

Project Summary

Robo-Dopamine is a framework for high-precision robotic manipulation that pairs a General Reward Model (GRM), which scores task progress step by step, with an associated RL training system. It addresses the difficulty of producing stable, accurate reward signals, which reinforcement learning needs in order to converge quickly on complex robotic tasks. The project targets robotics and AI researchers and engineers, and aims to make robot learning more efficient by using a vision-language model for reward prediction.

How It Works

The core of Robo-Dopamine is the General Reward Model (GRM), a vision-language model that predicts task progress by processing task descriptions and multi-view images of initial, goal, and intermediate states. It employs "Multi-Perspective Progress Fusion" to combine incremental, forward-anchored, and backward-anchored predictions into a robust reward signal. Complementing this is the Dopamine-RL training framework, which uses "One-Shot GRM Adaptation" to quickly adapt the GRM to new tasks and a "Policy-Invariant Reward Shaping" method to convert the GRM's dense output into an effective reward signal that speeds up learning without altering the optimal policy.
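The fusion and shaping steps described above can be sketched in Python. This is a minimal illustration, not the project's implementation: the names ProgressEstimates, fuse_progress, and shaped_reward are hypothetical, the unweighted mean is an assumed fusion rule (the summary does not say how the three predictions are combined), and the shaping term uses the classic potential-based form r + gamma * Phi(s') - Phi(s), which is one well-known way to leave the optimal policy unchanged. Whether Dopamine-RL uses exactly this form is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProgressEstimates:
    """Three hypothetical views of task progress, each nominally in [0, 1]."""
    incremental: float        # previous progress plus a predicted step change
    forward_anchored: float   # progress measured relative to the initial state
    backward_anchored: float  # progress inferred from the distance left to the goal


def clip01(x: float) -> float:
    """Clamp a value to the [0, 1] progress range."""
    return max(0.0, min(1.0, x))


def fuse_progress(est: ProgressEstimates) -> float:
    """ASSUMED fusion rule: unweighted mean of the three clipped estimates."""
    views = (est.incremental, est.forward_anchored, est.backward_anchored)
    return sum(clip01(v) for v in views) / len(views)


def shaped_reward(env_reward: float, phi_s: float, phi_next: float,
                  gamma: float = 0.99) -> float:
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s).

    phi_s and phi_next are the fused progress values (the potential Phi)
    before and after the transition. ASSUMPTION: the project's
    "Policy-Invariant Reward Shaping" may differ from this textbook form.
    """
    return env_reward + gamma * phi_next - phi_s
```

With gamma set to 1, the shaping terms along a trajectory telescope to Phi(end) - Phi(start), so the bonus depends only on where the episode starts and ends, not on the path taken. That is the intuition behind why a shaping term of this form can speed up learning without changing which policy is optimal.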

Quick Start & Requirements

Installation involves cloning the repository, creating and activating a Conda environment with Python 3.10, and installing dependencies with pip install -r requirements.txt. The project requires CUDA 12.8 or higher. The README links to the Hugging Face models (GRM-3B, GRM-8B, GRM-2.0-8B-Preview) and provides example usage scripts.
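Before installing, the two version requirements above can be checked with a small preflight helper. The sketch below is not part of the repository: cuda_ok and python_ok are hypothetical names, and the CUDA version string is assumed to be supplied by the caller (for example, read from the output of nvcc --version or from torch.version.cuda once PyTorch is installed). Versions are compared as integer tuples because comparing them as strings or floats would wrongly rank 12.10 below 12.8.

```python
import sys

MIN_CUDA = (12, 8)         # from the summary: CUDA 12.8 or higher
EXPECTED_PYTHON = (3, 10)  # from the summary: Conda environment with Python 3.10


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '12.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def cuda_ok(cuda_version: str) -> bool:
    """True if the given CUDA version meets the stated minimum."""
    return parse_version(cuda_version)[:2] >= MIN_CUDA


def python_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter's major.minor version is 3.10."""
    return tuple(version_info[:2]) == EXPECTED_PYTHON
```

For example, cuda_ok("12.10") returns True and cuda_ok("12.4") returns False.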

Highlighted Details

Accepted to CVPR 2026.
Offers multiple GRM models, including GRM-3B, GRM-8B, and the more versatile GRM-2.0-8B-Preview.
Includes a benchmark suite (Robo-Dopamine-Bench) and evaluation code for assessing GRM performance.
Provides data generation pipelines and fine-tuning scripts to enable users to train GRMs on their own datasets.

Maintenance & Community

The project shows recent activity with news updates in March 2026, indicating active development. Specific community channels (e.g., Discord, Slack), sponsorships, or detailed roadmaps beyond the immediate TODO list are not explicitly mentioned in the README.

Licensing & Compatibility

The provided README does not specify a software license. This omission makes it impossible to determine compatibility for commercial use or closed-source integration without further clarification.

Limitations & Caveats

The GRM-2.0-8B model is currently in a "Preview" state, with a final version expected soon. The full GRM dataset and pre-training code, as well as the Dopamine-RL training code for simulator and real-world settings, are still under development and slated for release in the coming months. The absence of a stated license is a significant adoption blocker.

Health Check

Last Commit

2 months ago

Responsiveness

Inactive

Star History

9 stars in the last 30 days