Popular-RL-Algorithms by quantumiracle

PyTorch implementations of model-free RL algorithms

Created 6 years ago

1,324 stars

Top 30.1% on SourcePulse

1 Expert Loves This Project

Benjamin Bolte

Cofounder of K-Scale Labs

Project Summary

This repository provides PyTorch implementations of many popular model-free reinforcement learning (RL) algorithms, aimed at researchers and students of RL. It gathers algorithms such as SAC, TD3, PPO, and Q-learning in one place for comparison and study, using standard Gym environments alongside custom implementations.

How It Works

The project implements a wide range of RL algorithms, from Q-learning variants and basic actor-critic methods to more advanced techniques such as Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO). Several algorithms (e.g., SAC, PPO) come in multiple implementation versions to make comparison and study easier, and many reference the original papers for theoretical grounding. The code is primarily in PyTorch; TensorFlow 2.0 implementations are mentioned in linked external repositories.
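As a concrete illustration of one technique named above, the following is a minimal sketch of PPO's clipped surrogate objective, written in plain Python rather than PyTorch. It is not taken from the repository: the function name ppo_clip_objective, its arguments, and the default clip_eps=0.2 are assumptions chosen for illustration.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to be maximized).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    clip_eps:   clipping range (hypothetical default, chosen for illustration)
    """
    if not ratios or len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Clip the ratio into [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

In a training loop the negative of this value would typically be minimized with a gradient-based optimizer; the sketch only shows how the clipping enters the objective.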

Quick Start & Requirements

Install dependencies via pip install -r requirements.txt.
Requires gym==0.7 or gym==0.10 due to compatibility issues with newer versions.
Usage examples: python ***.py --train or python ***.py --test, where ***.py stands for the script of the chosen algorithm.
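The --train and --test flags in the usage line could be handled along the following lines. This is a sketch under assumptions, not the repository's actual argument handling: the helper name parse_mode is hypothetical, and making the two flags mutually exclusive and required is a design choice made here for illustration.

```python
import argparse


def parse_mode(argv=None):
    """Return "train" or "test" depending on which flag was passed.

    argv: list of command-line arguments; None means use sys.argv[1:].
    """
    parser = argparse.ArgumentParser(description="Train or test an RL agent.")
    # Assumption: exactly one of the two flags must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--train", action="store_true", help="run training")
    group.add_argument("--test", action="store_true", help="run evaluation")
    args = parser.parse_args(argv)
    return "train" if args.train else "test"


# Passing the arguments explicitly keeps the example independent of sys.argv.
mode = parse_mode(["--train"])
print(mode)
```

Passing an explicit argument list, as in the last lines, also makes the parsing logic easy to exercise without launching a separate process.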

Highlighted Details

Implements over a dozen RL algorithms including SAC (multiple versions), TD3, PPO (continuous/discrete, multiprocessing), DDPG, Q-learning, SARSA, and QMIX.
Includes implementations for Recurrent Policy Gradient (LSTM/GRU), Soft Decision Trees for explainable RL, and Probabilistic Mixture-of-Experts.
Offers insights into "undervalued tricks" for RL implementation, such as reward preprocessing, normalization strategies, and multiprocessing considerations.
Provides performance comparisons for SAC and TD3 on Pendulum-v0, and notes AC/A2C's limitations in continuous control.
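One normalization strategy of the kind mentioned above is to standardize rewards (or observations) with running statistics. The sketch below uses Welford's online algorithm in plain Python; it is not the repository's code, and the class name RunningNormalizer, its methods, and the eps default are assumptions made for illustration.

```python
import math


class RunningNormalizer:
    """Tracks a running mean and standard deviation of scalar values."""

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0    # running sum of squared deviations from the mean
        self.eps = eps   # avoids division by zero when the std is tiny

    def update(self, x):
        # Welford's online update: numerically stable, one pass, O(1) memory.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Population standard deviation; fall back to 1.0 with too few samples.
        if self.count < 2:
            return 1.0
        return math.sqrt(self.m2 / self.count)

    def normalize(self, x):
        return (x - self.mean) / (self.std + self.eps)


normalizer = RunningNormalizer()
for reward in [1.0, 2.0, 3.0, 4.0]:
    normalizer.update(reward)
scaled = normalizer.normalize(4.0)
```

Whether to normalize per batch, per episode, or with running statistics as here is one of the choices that can noticeably affect training stability.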

Maintenance & Community

This repository appears to be a personal collection rather than an actively maintained library. The author invites discussions on implementations. Links to related projects like MARS (WIP for multi-agent RL) and external resources like a Deep Reinforcement Learning book are provided.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Given its nature as a personal collection, commercial use or integration into closed-source projects may require clarification from the author.

Limitations & Caveats

The code is described as not heavily cleaned or structured, and several algorithms exist in multiple versions. Some implementations, such as PPG, MPO, and AWR, are marked as "todo." The multiprocessing implementations may be unsafe because they do not use explicit locks, which leaves room for race conditions. Newer Gym versions are explicitly noted as causing compatibility problems.
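To illustrate the race-condition concern, the sketch below guards a shared update with an explicit lock. It uses threads rather than processes so that it stays self-contained, and it is not code from the repository; SharedCounter and run_workers are hypothetical names. With the standard multiprocessing module, the analogous tools would be multiprocessing.Lock or the get_lock() method of a multiprocessing.Value.

```python
import threading


class SharedCounter:
    """A counter whose read-modify-write update is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two workers could read the same old value
        # and one of the two increments would be lost.
        with self._lock:
            self.value += 1


def run_workers(n_workers=4, n_steps=10000):
    """Run n_workers threads that each increment the counter n_steps times."""
    counter = SharedCounter()

    def work():
        for _ in range(n_steps):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, the final count always equals n_workers * n_steps; shared parameters or statistics updated by several workers would need the same kind of protection.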

Health Check

Last Commit

10 months ago

Responsiveness

Inactive

Star History

8 stars in the last 30 days