RL from human preferences via a webapp for feedback collection
This repository provides an implementation of Deep Reinforcement Learning from Human Preferences, enabling users to train RL agents for tasks lacking explicit reward functions. It's suitable for researchers and practitioners interested in shaping agent behavior through human feedback, offering a novel approach to complex control problems.
How It Works
The system comprises a reward predictor trained on human preferences, which is then integrated into standard RL algorithms (TRPO, PPO). Human feedback is collected via a web application that presents pairs of trajectory segments for comparison. This approach allows agents to learn nuanced behaviors that are difficult to define with traditional reward functions.
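As a rough illustration of the preference-based reward learning described above, here is a minimal sketch (not the repository's actual code) of the Bradley-Terry style comparison loss used in the original paper: predicted per-step rewards are summed over each trajectory segment, and a softmax over the two segment returns gives the probability that the human prefers segment A. All names here (reward_net, segment_a, label_a, and so on) are hypothetical.

import numpy as np

def segment_return(reward_net, segment):
    # Sum the predicted per-step reward over one trajectory segment.
    return sum(reward_net(obs, act) for obs, act in segment)

def preference_loss(reward_net, segment_a, segment_b, label_a):
    # Bradley-Terry model: P(A preferred) = exp(R_A) / (exp(R_A) + exp(R_B)),
    # computed in the numerically stable logistic form.
    r_a = segment_return(reward_net, segment_a)
    r_b = segment_return(reward_net, segment_b)
    p_a = 1.0 / (1.0 + np.exp(r_b - r_a))
    eps = 1e-8  # guard against log(0)
    # Cross-entropy against the human label (1.0 = A preferred, 0.0 = B preferred).
    return -(label_a * np.log(p_a + eps) + (1 - label_a) * np.log(1 - p_a + eps))

# Toy usage: a linear reward "network" over concatenated (obs, act) features.
rng = np.random.default_rng(0)
w = rng.normal(size=4)
reward_net = lambda obs, act: float(w @ np.concatenate([obs, act]))
make_segment = lambda: [(rng.normal(size=2), rng.normal(size=2)) for _ in range(5)]
print(preference_loss(reward_net, make_segment(), make_segment(), label_a=1.0))

In the full system, a loss of this form trains the reward predictor on the comparisons collected from the web application, and the predictor's per-step output then stands in for the environment reward seen by TRPO or PPO.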
Quick Start & Requirements
pip install -e .
This installs the package and related sub-packages.
Maintenance & Community
The project acknowledges contributions from Paul Christiano, Dario Amodei, Max Harms (@raelifin), Catherine Olsson (@catherio), and Kevin Frans (@kvfrans); an Atari extension of the project is also noted.
Licensing & Compatibility
No license is explicitly stated in the README, which creates legal uncertainty for commercial or closed-source use.
Limitations & Caveats
The README specifies Python 3.5, which is outdated. Headless video rendering on Linux requires manually installing Xdummy and other dependencies. Media storage relies on Google Cloud Storage, so a cloud project must be configured.