ericyangyu: PyTorch PPO implementation for beginners
Top 32.7% on SourcePulse
This repository provides a simplified, well-documented PyTorch implementation of Proximal Policy Optimization (PPO), designed for beginners in Reinforcement Learning. It aims to demystify PPO with a bare-bones, easy-to-follow codebase, and pairs with a Medium article series that supplies the theoretical grounding.
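At the heart of PPO is a clipped surrogate objective. As a minimal illustration (not code from this repository; the function and argument names are assumptions), the objective can be sketched as:

```python
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip=0.2):
    """Clipped surrogate objective from the PPO paper, negated so it can be
    minimized as a loss. Arguments are 1-D tensors of per-timestep values;
    names are illustrative, not taken from the repository's code."""
    ratios = torch.exp(log_probs_new - log_probs_old)             # pi_new / pi_old
    surr1 = ratios * advantages                                   # unclipped term
    surr2 = torch.clamp(ratios, 1 - clip, 1 + clip) * advantages  # clipped term
    return -torch.min(surr1, surr2).mean()                        # pessimistic bound
```

Taking the elementwise minimum of the clipped and unclipped terms keeps each policy update inside a small trust region around the old policy, which is what makes PPO stable enough for beginners to train reliably.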
How It Works
The implementation follows the pseudocode from OpenAI's Spinning Up, prioritizing clarity and structure. It uses a feed-forward neural network for both the actor and the critic, and is designed for continuous observation and action spaces, though it can be adapted to discrete spaces. The core logic resides in ppo.py, with main.py orchestrating environment initialization, model training, and testing.
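The feed-forward network described above might look like the following minimal sketch (layer widths, names, and dimensions here are assumptions for illustration, not taken from ppo.py):

```python
import torch
import torch.nn as nn

class FeedForwardNN(nn.Module):
    """Small MLP used for both the actor (obs -> action mean) and the
    critic (obs -> scalar value). Hidden sizes are illustrative."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.layer1 = nn.Linear(in_dim, 64)
        self.layer2 = nn.Linear(64, 64)
        self.layer3 = nn.Linear(64, out_dim)

    def forward(self, obs):
        x = torch.relu(self.layer1(obs))
        x = torch.relu(self.layer2(x))
        return self.layer3(x)

# Example dimensions for a continuous-control task such as Pendulum-v1
# (observation dim 3, action dim 1); the repo works with any Box spaces.
actor = FeedForwardNN(in_dim=3, out_dim=1)
critic = FeedForwardNN(in_dim=3, out_dim=1)
```

Using the same simple architecture for both networks keeps the codebase small; only the output dimension differs (action dimension for the actor, 1 for the critic).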
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt
Train: python main.py
Test a trained policy: python main.py --mode test --actor_model ppo_actor.pth
Environments must use Box observation and action spaces.
A virtual environment is recommended (python -m venv venv, then source venv/bin/activate).
Maintenance & Community
The project is authored by Eric Yu. Contact information (Email, LinkedIn) is provided for questions.
Licensing & Compatibility
The repository's README does not state a license.
Limitations & Caveats
The implementation is primarily designed for continuous observation and action spaces. Generating all data for the Medium article's graphs takes approximately 10 hours on a standard computer.
Last updated about 1 year ago; the project is currently inactive.