Reinforcement-Implementation by zhangchuheng123

PyTorch implementations of RL algorithms

Created 7 years ago

472 stars

Top 64.5% on SourcePulse

Project Summary

This repository provides PyTorch implementations of benchmark model-free reinforcement learning algorithms for continuous action domains, primarily targeting MuJoCo environments. It's designed for researchers and practitioners seeking a straightforward, modular codebase to reproduce RL algorithm results and test new ideas quickly.

How It Works

The project focuses on a simple, modular implementation style, with each algorithm in a separate file. It aims to closely follow original research papers to reproduce reported results, making it easier to understand and extend. The current implementations cover several key algorithms in the continuous action space, with plans to expand to discrete action spaces and more complex techniques.

Quick Start & Requirements

Install: pip install -r requirements.txt
Prerequisites: PyTorch and a working MuJoCo environment.
Demo: See the PPO paper for a performance comparison.

Highlighted Details

Implements A2C, ACER (with noted issues), CEM, TRPO, PPO, and Vanilla PG.
Includes high-quality Rainbow and DQN implementations for Atari and simpler environments.
Aims to match or exceed performance of established libraries like OpenAI Baselines.
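
As an illustration of one of the listed algorithms, the clipped surrogate objective from the PPO paper can be sketched as follows. This is a minimal sketch and not the repository's code: it uses plain Python lists instead of PyTorch tensors so that it runs without dependencies. The function name `ppo_clipped_loss`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples.

    Hypothetical helper for illustration; inputs are equal-length lists of
    floats (per-sample log-probabilities and advantage estimates).
    """
    if not advantages:
        raise ValueError("need at least one sample")
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # Return a loss to minimize, hence the sign flip on the objective.
    return -total / len(advantages)
```

With a positive advantage, clipping caps the gain once the ratio exceeds 1 + clip_eps. With a negative advantage and the same large ratio, the unclipped term is the smaller one, so the penalty is not capped. A PyTorch version would typically express the same computation with tensor operations so that gradients flow through `new_log_probs`.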

Maintenance & Community

The README describes the project as actively developed by zhangchuheng123 and welcomes bug reports and contributions, though the last commit was 3 years ago.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking is therefore undetermined.

Limitations & Caveats

The ACER implementation is noted to have potential issues. The method for counting rewards may underestimate actual performance. The project is primarily focused on continuous action domains, with discrete action support being a secondary development goal.

Health Check

Last Commit: 3 years ago
Responsiveness: Inactive

Star History: 0 stars in the last 30 days