This repository provides a PyTorch implementation of popular Deep Reinforcement Learning algorithms, including policy gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). It is aimed at RL researchers and practitioners who need a baseline implementation of these algorithms. Its main selling points are speed and support for both discrete and continuous action spaces.
How It Works
The implementation uses PyTorch for its neural network components and supports multiprocessing, so several worker processes can interact with environments in parallel, which speeds up sample collection. A notable feature is a fast Fisher vector product calculation for TRPO. TRPO's conjugate gradient step calls this product repeatedly, so its cost matters for the speed of each policy update. The code is organized by algorithm and includes examples for running each one.
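The Fisher vector product mentioned above is commonly computed without ever forming the Fisher matrix, by differentiating the KL divergence twice. The sketch below shows that generic double-backprop approach for a small categorical policy. It is not taken from this repository, whose fast variant may work differently. The names flat_grad, mean_kl, and fisher_vector_product, the damping value, and the toy policy are all assumptions made for illustration, and the code assumes a recent PyTorch release rather than 0.4.

```python
import torch
import torch.nn as nn


def flat_grad(output, params, create_graph=False):
    """Gradient of a scalar w.r.t. params, flattened into one vector."""
    grads = torch.autograd.grad(output, params, create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])


def mean_kl(policy, states):
    """Mean KL(old || new) for a categorical policy.

    The "old" distribution is the current one with gradients detached,
    so the value is zero but its second derivative is the Fisher matrix.
    """
    log_p = torch.log_softmax(policy(states), dim=-1)
    log_p_old = log_p.detach()
    return (log_p_old.exp() * (log_p_old - log_p)).sum(dim=-1).mean()


def fisher_vector_product(policy, states, v, damping=0.1):
    """Return (F + damping * I) v using two backward passes."""
    params = list(policy.parameters())
    kl = mean_kl(policy, states)
    # First pass: gradient of KL, keeping the graph for a second pass.
    grad_kl = flat_grad(kl, params, create_graph=True)
    # Second pass: gradient of (grad_kl . v) is the Hessian-vector product.
    grad_dot_v = (grad_kl * v).sum()
    hvp = flat_grad(grad_dot_v, params)
    # Damping (an assumed value here) keeps the linear system well conditioned.
    return hvp + damping * v
```

With policy = nn.Linear(4, 3) and a batch of states of shape (N, 4), fisher_vector_product(policy, states, v) returns a vector the same size as v. A conjugate gradient solver would call it once per iteration.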
Quick Start & Requirements
Install: pip install -r requirements.txt (after cloning).
Prerequisites include mujoco-py and gym.
Environment setting: export OMP_NUM_THREADS=1.
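When a run is launched from Python rather than from a shell, the same OMP_NUM_THREADS setting can be passed to the child process through its environment. The sketch below uses only the standard library. The helper names single_thread_env and run_with_single_thread are hypothetical, and the likely motivation (avoiding thread oversubscription when several worker processes run at once) is an assumption, not something stated here.

```python
import os
import subprocess
import sys


def single_thread_env(base_env=None):
    """Copy an environment mapping and pin OpenMP to one thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


def run_with_single_thread(args):
    """Run the current Python interpreter with OMP_NUM_THREADS=1.

    args are the interpreter arguments, e.g. a script path plus its flags.
    """
    return subprocess.run(
        [sys.executable, *args],
        env=single_thread_env(),
        capture_output=True,
        text=True,
        check=True,
    )
```

The variable has to be in the environment before the numerical libraries start up, which is why it is set on the child process here instead of inside the script being run.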
Highlighted Details
Maintenance & Community
The repository is a personal project by Khrylx. There are no explicit mentions of community channels, active development, or partnerships in the README.
Licensing & Compatibility
The README does not explicitly state a license. The code references openai/baselines, which is MIT licensed, but this does not guarantee the license of this specific repository. Compatibility for commercial use is not specified.
Limitations & Caveats
The code is noted to work with PyTorch 0.4, with a separate branch for PyTorch 0.3, so it may need changes to run on newer PyTorch versions. The project's maintenance status and community support are unclear from the README.