PyTorch implementation of A3C reinforcement learning algorithm
This repository provides a PyTorch implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm, as described in the paper "Asynchronous Methods for Deep Reinforcement Learning." It is designed for researchers and practitioners in reinforcement learning who want to experiment with or deploy A3C, and it supports sharing optimizer statistics across worker processes for improved performance.
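The shared-statistics idea can be sketched roughly as follows, assuming a recent PyTorch release (the repository's actual optimizer may be implemented differently): a small Adam subclass allocates its moment buffers eagerly and moves them into shared memory, so every worker process reads and updates the same statistics.

```python
import torch

class SharedAdam(torch.optim.Adam):
    """Adam whose moment estimates live in shared memory, so all worker
    processes read and update the same optimizer statistics.
    (Illustrative sketch only; not the repository's exact code.)"""

    def __init__(self, params, lr=1e-4, betas=(0.9, 0.999), eps=1e-8):
        super().__init__(params, lr=lr, betas=betas, eps=eps)
        for group in self.param_groups:
            for p in group["params"]:
                state = self.state[p]
                # Allocate the state Adam would normally lazy-initialize...
                state["step"] = torch.zeros(())
                state["exp_avg"] = torch.zeros_like(p.data)
                state["exp_avg_sq"] = torch.zeros_like(p.data)
                # ...and place it in shared memory before workers are launched.
                state["step"].share_memory_()
                state["exp_avg"].share_memory_()
                state["exp_avg_sq"].share_memory_()
```

Such an optimizer would be constructed once over the global model's parameters in the launcher process and handed to every worker.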
How It Works
The implementation leverages PyTorch for building the neural network models and managing computations. It utilizes an asynchronous approach where multiple worker processes interact with the environment concurrently, collecting experiences and updating a shared global network. This asynchronous nature allows for faster learning by reducing correlation between samples and enabling parallel exploration.
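The per-worker update cycle can be illustrated with the simplified sketch below; it is not the repository's code (TinyPolicy, push_gradients, and the random-data loss are stand-ins, and gradients are copied rather than aliased). Each worker repeatedly pulls the latest global weights, computes gradients on its own rollout, writes them onto the shared model, and applies an optimizer step Hogwild-style, without locks.

```python
import torch
import torch.nn as nn
import torch.multiprocessing as mp

class TinyPolicy(nn.Module):
    """Stand-in for the actor-critic network (hypothetical, for illustration)."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

def push_gradients(local_model, shared_model):
    """Copy this worker's locally computed gradients onto the shared model."""
    for lp, sp in zip(local_model.parameters(), shared_model.parameters()):
        sp.grad = None if lp.grad is None else lp.grad.clone()

def worker(rank, shared_model, steps=100):
    torch.manual_seed(rank)              # decorrelate workers' exploration
    local_model = TinyPolicy()
    # Each worker optimizes the *shared* parameters; a single shared
    # optimizer (with shared statistics) could be passed in instead.
    optimizer = torch.optim.Adam(shared_model.parameters(), lr=1e-4)
    for _ in range(steps):
        # 1. Pull the latest global weights.
        local_model.load_state_dict(shared_model.state_dict())
        # 2. Collect a rollout and compute the A3C loss (faked with random data here).
        loss = local_model(torch.randn(8, 4)).pow(2).mean()
        # 3. Backprop locally, then apply the gradients to the shared model.
        local_model.zero_grad()
        loss.backward()
        push_gradients(local_model, shared_model)
        optimizer.step()                 # updates the shared parameters in place

if __name__ == "__main__":
    shared_model = TinyPolicy()
    shared_model.share_memory()          # parameters become visible to all workers
    procs = [mp.Process(target=worker, args=(rank, shared_model)) for rank in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```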
Quick Start & Requirements
python3 main.py --env-name "PongDeterministic-v4" --num-processes 16
The --env-name flag accepts Gym Atari environment IDs (e.g., PongDeterministic-v4, BreakoutDeterministic-v4).
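A quick way to confirm that an environment ID is available locally (assuming gym with Atari support is installed) is to construct it directly:

```python
import gym  # assumes gym with Atari extras installed

env = gym.make("PongDeterministic-v4")
print(env.observation_space, env.action_space)
env.close()
```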
Highlighted Details
With 16 processes, training on PongDeterministic-v4 converges in approximately 15 minutes; BreakoutDeterministic-v4 training requires several hours.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The README suggests that A2C, PPO, and ACKTR may offer better performance than A3C, implying A3C might not be the state-of-the-art choice for all tasks. Training on more complex environments like BreakoutDeterministic-v4
can be time-consuming.