rl_a3c_pytorch by dgriff777

PyTorch implementation of A3C for Atari games

created 8 years ago
570 stars

Top 57.4% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm, tailored for Atari environments. It aims to significantly accelerate training through a GPU-centric variant called A3G, making it well suited to reinforcement-learning researchers and practitioners who want faster experimentation and state-of-the-art scores on Atari benchmarks.

How It Works

The A3G architecture extends A3C by giving each agent its own copy of the network on a GPU, while the shared model resides on the CPU. Each agent's model is quickly copied back to the CPU for frequent, lock-free updates to the shared model, following Hogwild! training principles. This asynchronous update mechanism drastically reduces training time, enabling convergence in minutes on some Atari games.
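The update loop described above can be sketched in PyTorch. This is a minimal, hypothetical illustration of the Hogwild!-style pattern (a shared CPU model in shared memory, per-worker local copies, gradients copied back and applied without locks), not the repository's actual code; `TinyPolicy` and `worker` are invented names for the sketch.

```python
import torch
import torch.multiprocessing as mp
import torch.nn as nn

class TinyPolicy(nn.Module):
    """Stand-in for the A3C network; a single linear layer for brevity."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

def worker(shared_model, gpu_id=-1):
    # Each worker keeps its own copy of the network, on a GPU if one
    # is assigned (gpu_id >= 0), otherwise on the CPU.
    device = torch.device(f"cuda:{gpu_id}" if gpu_id >= 0 else "cpu")
    local_model = TinyPolicy().to(device)
    # The optimizer updates the *shared* CPU parameters in place.
    optimizer = torch.optim.Adam(shared_model.parameters(), lr=1e-3)

    for _ in range(5):
        # Sync the local copy from the shared CPU model.
        local_model.load_state_dict(shared_model.state_dict())
        local_model.zero_grad()
        x = torch.randn(1, 4, device=device)  # dummy rollout/loss
        loss = local_model(x).pow(2).sum()
        loss.backward()
        # Copy gradients from the local (possibly GPU) model back to
        # the shared CPU parameters, then step without any lock.
        for lp, sp in zip(local_model.parameters(), shared_model.parameters()):
            sp.grad = lp.grad.detach().cpu()
        optimizer.step()  # lock-free (Hogwild!-style) shared update

if __name__ == "__main__":
    shared = TinyPolicy()
    shared.share_memory()  # place parameters in shared memory
    procs = [mp.Process(target=worker, args=(shared,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Because the shared parameters live in shared memory and each `optimizer.step()` mutates them in place, concurrent workers can race on updates; Hogwild! accepts these races in exchange for avoiding synchronization overhead.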

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python 2.7+, OpenAI Gym and Universe, PyTorch (note: PyTorch 2.0 may cause GPU memory issues; consider downgrading).
  • Training: python main.py --env PongNoFrameskip-v4 --workers 32
  • A3G Training (4x V100 GPUs): python main.py --env PongNoFrameskip-v4 --workers 32 --gpu-ids 0 1 2 3 --amsgrad
  • Evaluation: python gym_eval.py --env PongNoFrameskip-v4 --num-episodes 100
  • Docs: a3c_continuous, a companion repository for continuous action spaces

Highlighted Details

  • Achieves state-of-the-art scores on multiple OpenAI Gym Atari environments, including world-record performance on Space Invaders.
  • Demonstrates training times as low as 10 minutes for solving games like Pong and achieving high scores in Breakout.
  • Includes support for RMSProp and Adam optimizers with shared statistics, and an option for non-shared optimizers.
  • Features distributed step size training for further performance tuning.
  • Integrates TensorBoard for logging, graphing training progress, and visualizing model weights.
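The "shared statistics" option mentioned above refers to keeping the optimizer's moment estimates in shared memory so that all worker processes update the same statistics. Below is a hedged sketch of that pattern (the repository ships its own shared optimizer; this generic `SharedAdam` subclass is an illustration of the idea, not the project's implementation):

```python
import torch

class SharedAdam(torch.optim.Adam):
    """Adam whose per-parameter state (step, exp_avg, exp_avg_sq) is
    allocated eagerly so it can be moved into shared memory and
    updated by multiple worker processes."""
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999),
                 eps=1e-8, weight_decay=0, amsgrad=False):
        super().__init__(params, lr=lr, betas=betas, eps=eps,
                         weight_decay=weight_decay, amsgrad=amsgrad)
        for group in self.param_groups:
            for p in group["params"]:
                state = self.state[p]
                state["step"] = torch.zeros(1)
                state["exp_avg"] = torch.zeros_like(p.data)
                state["exp_avg_sq"] = torch.zeros_like(p.data)
                if amsgrad:
                    state["max_exp_avg_sq"] = torch.zeros_like(p.data)

    def share_memory(self):
        # Move the optimizer statistics into shared memory so that
        # every forked worker process sees and mutates the same buffers.
        for group in self.param_groups:
            for p in group["params"]:
                state = self.state[p]
                for key in ("step", "exp_avg", "exp_avg_sq", "max_exp_avg_sq"):
                    if key in state:
                        state[key].share_memory_()
```

A non-shared variant, by contrast, simply constructs an ordinary optimizer inside each worker, so every process maintains its own independent statistics.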

Maintenance & Community

The project appears to be a personal implementation by dgriff777, with no explicit mention of a broader community or ongoing maintenance beyond initial updates.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project targets the older Python 2.7+ baseline and may be incompatible with newer PyTorch versions. The README notes that OpenAI Gym's Atari settings are more challenging than standard ALE because of stochastic frame skipping and action repetition. Trained models have been removed from the repository to reduce its size.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days
