PPO-for-Beginners by ericyangyu

PyTorch PPO implementation for beginners

Created 5 years ago

1,192 stars

Top 32.7% on SourcePulse

Project Summary

This repository provides a simplified, well-documented PyTorch implementation of Proximal Policy Optimization (PPO), designed for beginners in Reinforcement Learning. It aims to demystify PPO with a bare-bones, easy-to-follow codebase that maps directly onto a companion Medium article series covering the theory.

How It Works

The implementation follows the PPO pseudocode from OpenAI's Spinning Up and prioritizes clarity and structure. It uses feed-forward neural networks for the actor (the policy) and the critic (the value function). It is designed for continuous observation and action spaces but can be adapted to discrete spaces. The core logic resides in ppo.py, while main.py handles environment initialization, model training, and testing.
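The central step of the Spinning Up pseudocode is the clipped surrogate objective. The following is a minimal sketch in plain Python, not the repository's PyTorch code: the function name ppo_clip_objective and the default clip value of 0.2 are assumptions made for illustration, and real implementations operate on tensors rather than lists.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip=0.2):
    """Mean PPO-Clip surrogate objective over a batch (to be maximized).

    Hypothetical helper for illustration; `clip` = 0.2 is an assumed default.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip, 1 + clip].
        clipped_ratio = max(1.0 - clip, min(1.0 + clip, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds 1 + clip, which is what keeps each update close to the policy that collected the data. In a gradient-descent framework the actor loss would be the negative of this value.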

Quick Start & Requirements

Install: pip install -r requirements.txt
Run training: python main.py
Run testing: python main.py --mode test --actor_model ppo_actor.pth
Prerequisites: Python, PyTorch. Environments must use Box spaces for both observations and actions.
Setup (before installing): Create and activate a virtual environment (python -m venv venv, then source venv/bin/activate).
Additional Resources: Medium Article Series, Spinning Up PPO.
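The commands above pass --mode and --actor_model on the command line. As a rough sketch of how such flags could be parsed with the standard library, assuming a default mode of train and an empty default model path (the repository's actual parser may differ, and parse_args is a hypothetical name):

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags shown in the quick start (illustrative only)."""
    parser = argparse.ArgumentParser(description="Train or test a PPO agent.")
    # Assumed default: training when no mode is given, matching `python main.py`.
    parser.add_argument("--mode", choices=["train", "test"], default="train")
    # Path to a saved actor network; an empty string means "none supplied".
    parser.add_argument("--actor_model", default="")
    return parser.parse_args(argv)
```

For example, parse_args(["--mode", "test", "--actor_model", "ppo_actor.pth"]) yields a namespace whose mode is "test" and whose actor_model is "ppo_actor.pth".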

Highlighted Details

Maps the code directly to a step-by-step Medium tutorial series.
Includes code for data collection and graph generation; pre-generated data is also provided.
Offers detailed comments and structure for pedagogical purposes.
Follows pseudocode from OpenAI's Spinning Up for PPO.
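One step of the Spinning Up pseudocode is computing the rewards-to-go for each collected timestep. The sketch below shows that step in plain Python. The function name compute_rtgs and the discount factor default of 0.95 are assumptions for illustration, not values confirmed by this summary.

```python
def compute_rtgs(batch_rews, gamma=0.95):
    """Discounted rewards-to-go for a batch of episodes.

    `batch_rews` is a list of episodes, each a list of per-step rewards.
    Returns one flat list ordered by episode, then by timestep.
    Hypothetical helper; `gamma` = 0.95 is an assumed default.
    """
    batch_rtgs = []
    for ep_rews in batch_rews:
        running = 0.0
        ep_rtgs = []
        # Walk backwards so each step can reuse the return of the step after it.
        for rew in reversed(ep_rews):
            running = rew + gamma * running
            ep_rtgs.append(running)
        ep_rtgs.reverse()  # restore chronological order within the episode
        batch_rtgs.extend(ep_rtgs)
    return batch_rtgs
```

Iterating backwards makes the computation linear in the episode length, and resetting the running total per episode prevents rewards from leaking across episode boundaries.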

Maintenance & Community

The project is authored by Eric Yu. Contact information (Email, LinkedIn) is provided for questions.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README.

Limitations & Caveats

The implementation is primarily designed for continuous observation and action spaces. Generating all data for the Medium article's graphs takes approximately 10 hours on a standard computer.

Explore Similar Projects

pytorch-rl by navneet-nmk

LlamaGym by KhoomeiK

rad by MishaLaskin

pg_travel by reinforcement-learning-kr

hindsight-experience-replay by TianhongDai

pytorch-cpp-rl by Omegastick

lets-do-irl by reinforcement-learning-kr

pytorch-rl by jingweiz

Reinforcement_Learning by pythonlessons

pfrl by pfnet

chainerrl by chainer

pytorch-a3c by ikostrikov