DeepLearningVideoGames by nikitasrivatsan

Deep Q learning research paper for video game strategy

created 9 years ago
1,091 stars

Top 35.5% on sourcepulse

Project Summary

This project implements Deep Q-Networks (DQN) to enable AI agents to learn strategies for playing video games like Pong and Tetris directly from raw pixel input. It targets researchers and developers interested in applying reinforcement learning to complex visual environments without prior game knowledge. The primary benefit is demonstrating human-level performance in Pong, showcasing the power and generalizability of deep learning for control tasks.

How It Works

The project utilizes a deep convolutional neural network (CNN) to approximate the action-value (Q) function. This Q-function estimates the expected future reward for taking a specific action in a given game state. The CNN processes raw pixel data, preprocessed into grayscale, resized, and stacked frames, to learn relevant features. Training employs Q-learning with experience replay and target networks, sampling minibatches from a memory of past transitions to stabilize learning and improve data efficiency.
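
The update described above can be sketched roughly as follows. This is a minimal illustration in TensorFlow, not the repository's code; the names (q_net, target_net, replay_memory), the discount factor, and the batch size are assumptions made for the example, and the states are taken to be the preprocessed frame stacks described above.

```python
import random

import numpy as np
import tensorflow as tf

GAMMA = 0.99       # assumed discount factor for future rewards
BATCH_SIZE = 32    # assumed minibatch size sampled from replay memory

def train_step(q_net, target_net, optimizer, replay_memory, num_actions):
    """One Q-learning update on a minibatch of stored (s, a, r, s', done) transitions."""
    batch = random.sample(replay_memory, BATCH_SIZE)
    states, actions, rewards, next_states, dones = map(np.array, zip(*batch))

    # Bootstrapped target: r + gamma * max_a' Q_target(s', a') for non-terminal s'.
    next_q = target_net(next_states).numpy().max(axis=1)
    targets = (rewards + GAMMA * next_q * (1.0 - dones)).astype(np.float32)

    with tf.GradientTape() as tape:
        q_values = q_net(states)                                  # Q(s, .) for the whole batch
        action_mask = tf.one_hot(actions, num_actions)
        chosen_q = tf.reduce_sum(q_values * action_mask, axis=1)  # Q(s, a) for the action taken
        loss = tf.reduce_mean(tf.square(targets - chosen_q))      # mean squared TD error

    grads = tape.gradient(loss, q_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, q_net.trainable_variables))
    return loss
```

Sampling the minibatch at random from the replay memory breaks the correlation between consecutive frames, which is what stabilizes learning and improves data efficiency.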

Quick Start & Requirements

  • Install: Requires TensorFlow.
  • Prerequisites: an Amazon Web Services G2 large instance (a GPU is recommended for efficient training).
  • Setup: the replay memory is first populated over 50,000 time steps, and epsilon is annealed linearly over 500,000 frames (see the schedule sketch after this list). Training on Pong reached good results after ~1.38 million time steps (~25 hours).
  • Links: videos of the DQN in action; visualizations of the convolutional layers and the Q function.
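
The observation and exploration schedule above can be expressed as a simple linear ramp. Only the 50,000-step observation phase and the 500,000-frame annealing window come from the summary; the starting and final epsilon values below are assumptions for illustration.

```python
OBSERVE_STEPS = 50_000    # replay memory is filled with random play before training updates
ANNEAL_FRAMES = 500_000   # window over which epsilon is annealed linearly
EPSILON_START = 1.0       # assumed initial exploration rate
EPSILON_FINAL = 0.05      # assumed final exploration rate

def epsilon_at(step: int) -> float:
    """Exploration rate at a given time step: constant while observing, then a linear ramp."""
    if step < OBSERVE_STEPS:
        return EPSILON_START
    frac = min(1.0, (step - OBSERVE_STEPS) / ANNEAL_FRAMES)
    return EPSILON_START + frac * (EPSILON_FINAL - EPSILON_START)
```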

Highlighted Details

  • Achieved better-than-human performance in Pong.
  • CNN architecture includes 8x8, 4x4, and 3x3 convolutional layers with max pooling, followed by fully connected layers (see the sketch after this list).
  • Uses Adam optimizer with a learning rate of 0.000001.
  • Replay memory size of 500,000 observations.
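
Put together, the network could look roughly like the sketch below (tf.keras is used for brevity). The kernel sizes, max pooling, and Adam learning rate follow the bullets above; the filter counts, strides, 80x80x4 input shape, and dense-layer width are assumptions, not the repository's exact configuration.

```python
import tensorflow as tf

def build_q_network(num_actions: int, input_shape=(80, 80, 4)) -> tf.keras.Model:
    """Q-network sketch: stacked grayscale frames in, one Q-value per action out."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 8, strides=4, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 4, strides=2, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(num_actions),   # linear output: one Q-value per action
    ])

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-6)  # learning rate from the summary
```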

Maintenance & Community

  • Based on the seminal work by Mnih et al. (2015).
  • Project appears to be a research demonstration rather than an actively maintained library.

Licensing & Compatibility

  • The README does not explicitly state a license. The underlying research paper is published in Nature.

Limitations & Caveats

  • Tetris implementation is still under development.
  • Max pooling might discard useful information; further parameter tuning is suggested.
  • Convergence speed may vary significantly based on reward frequency in different game genres.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 90 days
