MadMario by yfeng997

Interactive Reinforcement Learning tutorial for game AI

Created 6 years ago
251 stars

Top 99.9% on SourcePulse

Project Summary

MadMario is an interactive PyTorch tutorial for building an AI-powered Mario agent, aimed at first-time Reinforcement Learning (RL) learners. It walks users through implementing a learning agent with Double Q-learning and a Convolutional Neural Network (CNN), so the material is practical as well as educational.

How It Works

The project uses Double Q-learning, which reduces the value overestimation of standard Q-learning by selecting the next action with one network and evaluating it with another; this makes agent training more stable. A Convolutional Neural Network (CNN) acts as the function approximator, taking game state observations as input and estimating a Q-value for each action. Environment preprocessing, including image resizing and color space conversion, is handled by dedicated wrappers that prepare the observations for the neural network.
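To make the overestimation point concrete, here is a minimal sketch of a Double Q-learning target for a single transition, written in plain Python so it runs without PyTorch. The function names, the `gamma=0.9` discount, and the example Q-values are illustrative assumptions, not taken from the MadMario code.

```python
def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.9):
    """Double Q-learning TD target for one transition (illustrative sketch).

    next_q_online / next_q_target hold per-action Q-value estimates for the
    next state from the online and target networks (plain lists of floats).
    gamma=0.9 is an assumed discount factor.
    """
    # The online network selects the greedy next action...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...and the target network evaluates that action. Terminal transitions
    # do not bootstrap from the next state.
    bootstrap = 0.0 if done else next_q_target[best_action]
    return reward + gamma * bootstrap


def vanilla_q_target(reward, done, next_q_target, gamma=0.9):
    """Single-estimator target, shown only for comparison."""
    bootstrap = 0.0 if done else max(next_q_target)
    return reward + gamma * bootstrap


# Made-up estimates: the target network overrates action 0,
# while the online network prefers action 1.
online = [1.0, 2.5, 0.3]
target = [4.0, 1.5, 0.2]

dq = double_q_target(1.0, False, online, target)
vq = vanilla_q_target(1.0, False, target)
print(dq, vq)
```

With these made-up numbers the Double Q target comes out lower than the single-estimator target, because the inflated estimate for action 0 is never picked by the online network's argmax.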

Quick Start & Requirements

  • Installation: Install dependencies using Conda:
    conda env create -f environment.yml
    conda activate myenv
    
  • Prerequisites: Conda, Python. A GPU is highly recommended for training.
  • Running Training: python main.py
  • Running Evaluation: python replay.py
  • Training Time: Approximately 80 hours on CPU, 20 hours on GPU.
  • Resources: An interactive tutorial notebook (tutorial.ipynb) is available and can be run on Google Colab. A pre-trained checkpoint is provided: https://drive.google.com/file/d/1RRwhSMUrpBBRyAsfHLPGt1rlYFoiuus2/view?usp=sharing

Highlighted Details

  • Features an interactive tutorial notebook (tutorial.ipynb) with extensive explanations, runnable on Google Colab.
  • Implements Double Q-learning, a robust algorithm for deep reinforcement learning.
  • Logs key training and evaluation metrics including Episode, Step, Epsilon, MeanReward, MeanLength, MeanLoss, and MeanQValue.
  • Provides a pre-trained checkpoint for immediate evaluation.
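
As a rough illustration of how averaged metrics such as MeanReward and MeanLength can be produced, the sketch below keeps per-episode totals and averages them over a sliding window. The `MetricTracker` class, its method names, and the window size are hypothetical and are not the project's actual logger.

```python
from collections import deque


class MetricTracker:
    """Hypothetical helper: accumulates episode totals, reports moving averages."""

    def __init__(self, window=100):
        # Only the most recent `window` episodes contribute to the averages.
        self.rewards = deque(maxlen=window)
        self.lengths = deque(maxlen=window)
        self._reward = 0.0
        self._length = 0

    def log_step(self, reward):
        # Called once per environment step.
        self._reward += reward
        self._length += 1

    def log_episode(self):
        # Called at episode end: store the totals and reset the accumulators.
        self.rewards.append(self._reward)
        self.lengths.append(self._length)
        self._reward, self._length = 0.0, 0

    def summary(self):
        n = len(self.rewards)
        if n == 0:
            return {"MeanReward": 0.0, "MeanLength": 0.0}
        return {
            "MeanReward": sum(self.rewards) / n,
            "MeanLength": sum(self.lengths) / n,
        }


tracker = MetricTracker(window=2)
for episode_rewards in ([1.0, 2.0], [3.0], [5.0, 5.0, 5.0]):
    for r in episode_rewards:
        tracker.log_step(r)
    tracker.log_episode()

stats = tracker.summary()
print(stats)
```

A bounded `deque` keeps the averages responsive to recent behaviour; MeanLoss and MeanQValue could be tracked the same way by accumulating per-step values.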

Maintenance & Community

No specific information regarding maintainers, community channels (like Discord/Slack), or project roadmap is present in the provided README.

Licensing & Compatibility

The README does not explicitly state a software license.

Limitations & Caveats

Training is time-consuming: approximately 80 hours on CPU, or about 20 hours on GPU. The project is positioned as a tutorial for first-time RL learners, so it may not cover advanced RL techniques or optimizations.

Health Check

  • Last Commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 30 days
