Super-mario-bros-PPO-pytorch  by vietnh1009

PPO agent for Super Mario Bros

created 5 years ago
1,216 stars

Top 32.9% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of the Proximal Policy Optimization (PPO) algorithm, specifically tailored for training an agent to play Super Mario Bros. It aims to achieve higher performance than previous A3C implementations, with the goal of completing a significant majority of the game's levels. The target audience includes researchers and developers interested in reinforcement learning, particularly those exploring policy gradient methods for game environments.

How It Works

The project uses PPO, a policy gradient method known for its stability and sample efficiency, introduced in the OpenAI paper "Proximal Policy Optimization Algorithms." PPO limits how far each update can move the policy away from the one that collected the training data, which prevents drastic changes that could destabilize training. The training and testing scripts take the world and stage as arguments, so the same code can be run on different Super Mario Bros. levels.
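As an illustration of how PPO constrains policy updates, here is a minimal, framework-free sketch of the clipped surrogate objective from the PPO paper. It is not taken from this repository: the function names are hypothetical, and the clipping range `epsilon=0.2` is an assumed, commonly used value rather than the setting this project necessarily uses.

```python
import math


def clipped_surrogate(new_log_prob, old_log_prob, advantage, epsilon=0.2):
    """Per-sample PPO clipped objective (a quantity to maximize).

    epsilon is an assumed clipping range, not a value read from the repo.
    """
    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = math.exp(new_log_prob - old_log_prob)
    # Keep the ratio inside [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_policy_loss(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Negative mean clipped objective, suitable for a minimizer."""
    terms = [
        clipped_surrogate(n, o, a, epsilon)
        for n, o, a in zip(new_log_probs, old_log_probs, advantages)
    ]
    return -sum(terms) / len(terms)
```

With a positive advantage, the objective stops growing once the ratio exceeds `1 + epsilon`, so the optimizer gains nothing from pushing the policy further in that direction. With a negative advantage, the unclipped term is kept whenever it is worse, which makes the estimate pessimistic.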

Quick Start & Requirements

  • Run: train with python train.py --world <world_num> --stage <stage_num>; test with python test.py --world <world_num> --stage <stage_num>
  • Prerequisites: PyTorch, Python. GPU recommended for training.
  • Docker: A Dockerfile is provided for environment setup.
  • Docs: [PYTORCH] Proximal Policy Optimization (PPO) for playing Super Mario Bros
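The commands above take one world and one stage at a time. The following sketch shows one way to generate the command line for every level. It assumes that `--world` accepts integers 1 to 8 and `--stage` accepts integers 1 to 4 (matching the game's 32 levels), and the helper names are hypothetical rather than part of the repository.

```python
import sys

# Assumed numbering: 8 worlds with 4 stages each, 32 levels in total.
WORLDS = range(1, 9)
STAGES = range(1, 5)


def build_command(script, world, stage):
    """Build the argument list for one run of train.py or test.py."""
    return [sys.executable, script, "--world", str(world), "--stage", str(stage)]


def all_level_commands(script="train.py"):
    """Return one command per level, ordered 1-1, 1-2, ..., 8-4."""
    return [build_command(script, w, s) for w in WORLDS for s in STAGES]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Training all levels this way runs them one after another, which is likely to take a long time.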

Highlighted Details

  • Trained agent reportedly completes 31/32 Super Mario Bros. levels.
  • PPO is the algorithm used by OpenAI Five for Dota 2.
  • Offers flexibility in training by allowing specification of world and stage.
  • Docker support included for easier environment management.

Maintenance & Community

No specific information on contributors, sponsorships, or community channels (like Discord/Slack) is provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The agent has not been able to solve level 8-4, a puzzle-like level that requires specific path choices. There is also a known rendering bug under Docker: env.render() must be commented out, which disables visualization during training and testing.
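Instead of commenting out env.render() by hand, the call could be guarded by a switch. The sketch below is one possible approach, not something the repository provides: the `MARIO_RENDER` environment variable and both helper functions are hypothetical.

```python
import os


def render_enabled(environ=os.environ):
    """Rendering stays on unless MARIO_RENDER (hypothetical) is set to "0"."""
    return environ.get("MARIO_RENDER", "1") != "0"


def maybe_render(env, enabled):
    """Call env.render() only when rendering is enabled."""
    if enabled:
        env.render()
    return enabled
```

Under this assumption, a Docker run would set `MARIO_RENDER=0`, and the scripts would call `maybe_render(env, render_enabled())` wherever they currently call env.render().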

Health Check

  • Last commit: 4 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 65 stars in the last 90 days
