safety-starter-agents by openai

RL algorithms accompanying a research paper on safe exploration

Created 6 years ago

448 stars

Top 67.0% on SourcePulse

2 Experts Love This Project

Vincent Weisser

Cofounder of Prime Intellect

Evan Hubinger

Head of Alignment Stress-Testing at Anthropic

Project Summary

This repository provides implementations of constrained and unconstrained Reinforcement Learning (RL) algorithms, specifically PPO, TRPO, PPO-Lagrangian, TRPO-Lagrangian, and CPO. It serves as a companion to the paper "Benchmarking Safe Exploration in Deep Reinforcement Learning" and is intended for researchers and practitioners in safe RL.

How It Works

The PPO implementation differs from common ones such as Baselines: it omits observation and reward normalization and the clipped value loss, but it includes an early stopping trick. These choices favor straightforward comparison between the included algorithms in the paper's experiments over maximum sample efficiency for any single algorithm.
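The text above does not say how the early stopping trick works. A common form in PPO code halts the policy updates for the current batch once an estimate of the KL divergence between the old and new policy exceeds a threshold. The sketch below illustrates that idea only. It assumes this form of the trick, and the names `approx_kl`, `run_policy_updates`, `target_kl`, and `kl_margin` are hypothetical rather than taken from the repository.

```python
def approx_kl(logp_old, logp_new):
    """Sample estimate of KL(old || new): mean of (logp_old - logp_new)."""
    diffs = [old - new for old, new in zip(logp_old, logp_new)]
    return sum(diffs) / len(diffs)


def run_policy_updates(step_fn, logp_old, target_kl=0.01, max_iters=80, kl_margin=1.5):
    """Apply policy gradient steps until the KL estimate grows too large.

    step_fn is a hypothetical callable: it performs one gradient step and
    returns the new policy's log-probabilities for the same batch of actions.
    Returns (number of steps taken, last KL estimate).
    """
    kl = 0.0
    for i in range(max_iters):
        logp_new = step_fn()
        kl = approx_kl(logp_old, logp_new)
        # Assumed stopping rule: stop once KL exceeds kl_margin * target_kl.
        if kl > kl_margin * target_kl:
            return i + 1, kl
    return max_iters, kl


if __name__ == "__main__":
    # Toy stand-in for a real update: each step lowers every log-prob by 0.004.
    state = {"n": 0}
    old = [0.0, 0.0, 0.0, 0.0]

    def fake_step():
        state["n"] += 1
        return [-0.004 * state["n"] for _ in old]

    steps, kl = run_policy_updates(fake_step, old)
    print("stopped after", steps, "steps with KL estimate", kl)
```

Stopping on a KL estimate limits how far a single batch can move the policy, which is one way to keep updates stable without the normalization and value clipping that other implementations use.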

Quick Start & Requirements

Install via pip install -e . after cloning the repository.
Requires Python 3.6+.
Tested on macOS Mojave and Ubuntu 16.04 LTS.
Note: Does not include Safety Gym; it must be installed separately.
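The requirements above can be checked before installing. The following is a minimal sketch, not part of the repository: the helper name `preflight` is hypothetical, and it assumes Safety Gym is importable as the module `safety_gym`.

```python
import importlib.util
import sys


def preflight(version_info=None, required_modules=("safety_gym",)):
    """Return a list of human-readable problems; an empty list means OK."""
    version_info = sys.version_info if version_info is None else version_info
    problems = []
    # The listing states that Python 3.6+ is required.
    if tuple(version_info[:2]) < (3, 6):
        problems.append("Python 3.6+ is required")
    # Safety Gym is not bundled, so check that it is importable
    # (the module name safety_gym is an assumption).
    for name in required_modules:
        if importlib.util.find_spec(name) is None:
            problems.append("missing module: " + name)
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("OK" if not issues else "\n".join(issues))
```

Running the script prints OK when both conditions hold, and otherwise lists what is missing.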

Highlighted Details

Implements algorithms used in the "Benchmarking Safe Exploration in Deep Reinforcement Learning" paper.
Includes experimental implementations of SAC and SAC-Lagrangian.
Provides scripts for reproducing paper experiments, plotting results, and testing trained policies.

Maintenance & Community

Status: Archived (code provided as-is, no updates expected).
Developed by OpenAI.

Licensing & Compatibility

License: Not explicitly stated in the README.
Compatibility: Intended for research use; compatibility with commercial or closed-source projects is not specified.

Limitations & Caveats

The repository is archived, so no further updates or support are expected. The PPO implementation is not tuned for maximum sample efficiency, unlike some other common implementations. Results may not reproduce exactly across different machines.

Health Check

Last Commit: 2 years ago
Responsiveness: Inactive
Star History: 4 stars in the last 30 days