MadMario by yfeng997

Interactive Reinforcement Learning tutorial for game AI

Created 6 years ago

255 stars

Top 98.9% on SourcePulse

Project Summary

MadMario is an interactive PyTorch tutorial for building a Mario-playing agent, aimed at first-time Reinforcement Learning (RL) learners. It walks users step by step through building a learning agent with Double Q-learning and a Convolutional Neural Network (CNN), so the RL concepts are learned by implementing them.

How It Works

The project uses Double Q-learning, which reduces the overestimation of Q-values by selecting the next action with one network and evaluating it with another, leading to more stable training. A Convolutional Neural Network (CNN) acts as the function approximator, taking game-state observations (frames) as input and estimating a Q-value for each action. Environment preprocessing, including image resizing and color space conversion, is handled by dedicated wrappers that prepare the frames for the network.
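To make the Double Q-learning idea above concrete, here is a minimal sketch in plain Python of how the training target could be computed for a single transition. It is not taken from the repository: the function names, the list-based Q-value inputs, and the discount factor gamma=0.9 are illustrative assumptions, and the tutorial itself works with PyTorch tensors rather than lists.

```python
def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.9):
    """Double Q-learning target for one transition (hypothetical helper).

    next_q_online: Q-values for the next state from the online network.
    next_q_target: Q-values for the next state from the target network.
    gamma is an assumed discount factor, not a value confirmed by the README.
    """
    if done:
        # No future reward after a terminal state.
        return reward
    # The online network chooses the action...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...and the target network evaluates that chosen action.
    return reward + gamma * next_q_target[best_action]


def vanilla_q_target(reward, done, next_q_target, gamma=0.9):
    """Standard Q-learning target, shown only for comparison."""
    if done:
        return reward
    # The same network both chooses and evaluates the action.
    return reward + gamma * max(next_q_target)


if __name__ == "__main__":
    online = [1.0, 3.0, 2.0]   # made-up Q-values from the online network
    target = [5.0, 0.5, 4.0]   # made-up Q-values from the target network
    print("double :", double_q_target(1.0, False, online, target))
    print("vanilla:", vanilla_q_target(1.0, False, target))
```

With these made-up numbers the double target is lower than the vanilla one, because the target network's inflated estimate for action 0 is ignored unless the online network also prefers that action.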

Quick Start & Requirements

Installation: Install dependencies using Conda:

conda env create -f environment.yml
conda activate myenv

Prerequisites: Conda, Python. A GPU is highly recommended for training.
Running Training: python main.py
Running Evaluation: python replay.py
Training Time: Approximately 80 hours on CPU, 20 hours on GPU.
Resources: An interactive tutorial notebook (tutorial.ipynb) is available and can be run on Google Colab. A pre-trained checkpoint is provided: https://drive.google.com/file/d/1RRwhSMUrpBBRyAsfHLPGt1rlYFoiuus2/view?usp=sharing

Highlighted Details

Features an interactive tutorial notebook (tutorial.ipynb) with extensive explanations, runnable on Google Colab.
Implements Double Q-learning, a robust algorithm for deep reinforcement learning.
Logs key training and evaluation metrics including Episode, Step, Epsilon, MeanReward, MeanLength, MeanLoss, and MeanQValue.
Provides a pre-trained checkpoint for immediate evaluation.
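The metrics listed above (such as MeanReward and MeanLength) are averages over recent episodes. The following is a minimal sketch of how such moving averages could be tracked. The class name EpisodeStats, its methods, and the default window of 100 episodes are hypothetical and chosen for illustration; the repository's own logger may be organized differently.

```python
from collections import deque


class EpisodeStats:
    """Hypothetical helper: moving averages of per-episode reward and length."""

    def __init__(self, window=100):
        # Only the most recent `window` episodes are kept (assumed size).
        self.rewards = deque(maxlen=window)
        self.lengths = deque(maxlen=window)
        self._reward = 0.0
        self._length = 0

    def log_step(self, reward):
        """Accumulate the reward of one environment step."""
        self._reward += reward
        self._length += 1

    def log_episode(self):
        """Close the current episode and reset the running counters."""
        self.rewards.append(self._reward)
        self.lengths.append(self._length)
        self._reward = 0.0
        self._length = 0

    def summary(self):
        """Return the averages over the episodes currently in the window."""
        n = len(self.rewards)
        if n == 0:
            return {"MeanReward": 0.0, "MeanLength": 0.0}
        return {
            "MeanReward": sum(self.rewards) / n,
            "MeanLength": sum(self.lengths) / n,
        }


if __name__ == "__main__":
    stats = EpisodeStats(window=2)
    for episode_rewards in ([1.0, 2.0], [5.0], [1.0, 1.0, 1.0]):
        for r in episode_rewards:
            stats.log_step(r)
        stats.log_episode()
    print(stats.summary())
```

A bounded deque keeps the averages responsive to recent behavior, which is useful when the agent's policy changes during training.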

Maintenance & Community

The README gives no information about maintainers, community channels (such as Discord or Slack), or a project roadmap.

Licensing & Compatibility

The README does not explicitly state a software license.

Limitations & Caveats

Training is time-consuming: approximately 80 hours on CPU, or about 20 hours with a GPU. The project is positioned as a tutorial for first-time RL learners, so it may not cover advanced RL techniques or optimizations.

Health Check

Last Commit: 3 years ago
Responsiveness: Inactive

Star History

2 stars in the last 30 days