stable-baselines by Stable-Baselines-Team

RL algorithm implementations, a fork of OpenAI Baselines (maintenance mode)

Created 7 years ago

306 stars

Top 87.9% on SourcePulse


1 Expert Loves This Project

John Yang

Coauthor of SWE-bench, SWE-agent

Project Summary

This repository provides a fork of OpenAI Baselines, offering refactored and improved implementations of reinforcement learning algorithms. It targets researchers and practitioners seeking to easily replicate, refine, and benchmark RL algorithms, with a focus on usability and clear documentation.

How It Works

Stable-Baselines gives every algorithm the same unified structure, follows PEP 8, and ships with extensive documentation and tests. It adds algorithms that OpenAI Baselines lacks, such as SAC and TD3, and extends HER support to DQN, DDPG, SAC, and TD3.

Quick Start & Requirements

Install: pip install stable-baselines[mpi] (or pip install stable-baselines without MPI).
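Prerequisites: Python 3.5+ with a TensorFlow 1.x release (TensorFlow 2 is not supported), plus the system packages CMake, OpenMPI (only for the MPI install), and zlib1g-dev. Environments based on MuJoCo additionally require a MuJoCo license.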
Docs: https://stable-baselines.readthedocs.io/
RL Baselines Zoo: https://github.com/araffin/rl-baselines-zoo
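
The prerequisites above, together with the TensorFlow 2 limitation noted further down the page, can be checked before installing. The following is a minimal sketch, not part of the package: the helper names `tf_major_is_supported` and `preflight` are hypothetical, and it assumes that any TensorFlow release with major version 1 is acceptable.

```python
import sys
from importlib import import_module


def tf_major_is_supported(version):
    """Return True when a TensorFlow version string has major version 1.

    Assumption: any 1.x release is treated as acceptable.
    """
    return version.split(".")[0] == "1"


def preflight(python_version=sys.version_info, tf_version=None):
    """Collect human-readable problems; an empty list means none were found.

    tf_version=None means "try to detect the installed TensorFlow".
    """
    problems = []
    # The page lists Python 3.5+ as a prerequisite.
    if tuple(python_version[:2]) < (3, 5):
        problems.append("Python 3.5+ is required")
    if tf_version is None:
        try:
            tf_version = import_module("tensorflow").__version__
        except ImportError:
            problems.append("TensorFlow is not installed")
            return problems
    # The page states that TensorFlow 2 is not supported.
    if not tf_major_is_supported(tf_version):
        problems.append(
            "TensorFlow %s found; a 1.x release is expected" % tf_version
        )
    return problems
```

Calling `preflight()` with no arguments inspects the current interpreter; passing explicit versions, as in `preflight((3, 7), "1.14.0")`, makes the check easy to exercise without TensorFlow installed.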

Highlighted Details

Implements widely used RL algorithms, including A2C, ACER, ACKTR, DDPG, DQN, GAIL, PPO1, PPO2, SAC, TD3, and TRPO.
Supports Box, Discrete, MultiDiscrete, and MultiBinary action spaces, though not every algorithm handles all four.
Provides a scikit-learn-like API for ease of use.
Includes a collection of over 100 pre-trained RL agents in the RL Baselines Zoo.
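
The scikit-learn-like API mentioned above roughly amounts to constructing a model, calling `learn`, and then calling `predict`. The sketch below illustrates that pattern under several assumptions: the wrapper function `train_and_evaluate` and its defaults are hypothetical, the environment `CartPole-v1` is chosen only for illustration, and running it needs a working TensorFlow 1.x install plus a Gym release with the older four-value `step` interface. The imports sit inside the function so the file can be loaded without those dependencies.

```python
def train_and_evaluate(env_id="CartPole-v1", total_timesteps=10000, eval_steps=1000):
    """Train a PPO2 agent, then roll it out and return (model, total reward)."""
    # Lazy imports: stable_baselines needs TensorFlow 1.x at import time.
    import gym
    from stable_baselines import PPO2

    env = gym.make(env_id)

    # "Fit" step: build the model with an MLP policy and train it.
    model = PPO2("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # "Predict" step: query the trained policy one observation at a time.
    obs = env.reset()
    total_reward = 0.0
    for _ in range(eval_steps):
        action, _states = model.predict(obs, deterministic=True)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            obs = env.reset()
    return model, total_reward
```

The returned model can then be persisted with `model.save("ppo2_cartpole")` and restored with `PPO2.load("ppo2_cartpole")`; the file name here is only a placeholder.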

Maintenance & Community

This package is in maintenance mode, and its maintainers recommend Stable-Baselines3 (SB3) instead. Maintainers include Ashley Hill and Antonin Raffin, among others.

Licensing & Compatibility

The repository is available under the MIT license, permitting commercial use and linking with closed-source projects.

Limitations & Caveats

The package is in maintenance mode and does not support TensorFlow 2. Users are directed to Stable-Baselines3 for up-to-date implementations and support.

Health Check

Last commit: 2 years ago
Responsiveness: Inactive

Star History

0 stars in the last 30 days