universe-starter-agent by openai

Starter agent for solving Universe environments using A3C

Created 9 years ago

1,106 stars

Top 34.5% on SourcePulse

Project Summary

This repository provides a starter agent for solving OpenAI Universe environments, aimed at reinforcement learning researchers and developers. It implements the asynchronous advantage actor-critic (A3C) algorithm, adapted for real-time environments that may have high latency, and serves as a baseline for building more complex agents.

How It Works

The agent uses the asynchronous advantage actor-critic (A3C) algorithm, a reinforcement learning method designed for parallel training. It spawns multiple worker processes that interact with the environment and send updates to a central parameter server. Because the updates are asynchronous, no worker has to wait for the others to finish, so learning continues even when individual environment interactions are slow, as in real-time, high-latency environments.
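
The worker and parameter-server pattern described above can be sketched on a toy problem. The following is a minimal illustration, not the repository's code: it swaps the Universe environment for a hypothetical two-armed bandit, uses Python threads in place of worker processes, and guards the shared parameters with a lock for simplicity. All names (SharedParams, worker, train, ARM_REWARDS) are assumptions made for this example.

```python
import math
import random
import threading

# Hypothetical two-armed bandit: arm 1 always pays 1.0, arm 0 pays 0.0.
ARM_REWARDS = [0.0, 1.0]


class SharedParams:
    """Stands in for the central parameter server (illustrative only)."""

    def __init__(self):
        self.logits = [0.0, 0.0]  # actor: action preferences
        self.value = 0.0          # critic: estimate of the expected reward
        self.lock = threading.Lock()


def softmax(logits):
    """Convert logits to action probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def worker(params, seed, steps, lr=0.1):
    """One asynchronous worker: act, compute the advantage, push an update."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Take a snapshot of the shared parameters; it may already be stale
        # by the time this worker applies its update.
        with params.lock:
            logits = list(params.logits)
            value = params.value

        probs = softmax(logits)
        action = 0 if rng.random() < probs[0] else 1
        reward = ARM_REWARDS[action]

        # Advantage: how much better the reward was than the critic expected.
        advantage = reward - value

        # Gradient of log pi(action) with respect to the logits.
        grad = [(1.0 if i == action else 0.0) - probs[i] for i in range(2)]

        # Apply the update to the shared parameters without waiting for
        # any other worker to finish its own step.
        with params.lock:
            for i in range(2):
                params.logits[i] += lr * advantage * grad[i]
            params.value += lr * (reward - params.value)


def train(num_workers=4, steps=500):
    """Run several workers concurrently against one set of shared parameters."""
    params = SharedParams()
    threads = [
        threading.Thread(target=worker, args=(params, seed, steps))
        for seed in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params


if __name__ == "__main__":
    trained = train()
    print("action probabilities:", softmax(trained.logits))
    print("value estimate:", trained.value)
```

Each worker acts on a possibly stale snapshot of the parameters, which is the core of the asynchronous scheme: no worker waits for another before applying its update. After training, the policy should strongly prefer the paying arm.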

Quick Start & Requirements

Install: create and activate a conda environment (conda create --name universe-starter-agent python=3.5, then source activate universe-starter-agent), then install the dependencies via pip and conda as detailed in the README.
Prerequisites: Python 2.7 or 3.5, Golang, six, TensorFlow 0.12, tmux, htop, cmake, libjpeg-turbo (or libjpeg-dev on Linux), gym[atari], universe, opencv-python, numpy, scipy.
Setup: Requires significant dependency installation and environment configuration.
Docs: Retro Contest Blog Post

Highlighted Details

Solves Atari Pong (PongDeterministic-v3) in under 30 minutes with 16 workers on an m4.10xlarge instance.
Demonstrates solving VNC environments, highlighting challenges and strategies for handling network latency.
Includes examples for training on Flash games like Neon Race, achieving 80% of maximal score within 1-2 hours with 16 workers.
Supports visualization of the agent's actions via env.render().

Maintenance & Community

This repository has been deprecated in favor of the Retro library.

Licensing & Compatibility

The repository's license is not explicitly stated in the README.

Limitations & Caveats

The project is deprecated. Performance is highly sensitive to network latency, especially for real-time environments. The provided implementation is tuned for VNC Pong and may require significant adjustments for other tasks.
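
To get a feel for why latency matters in a real-time environment, a back-of-the-envelope calculation helps. The sketch below assumes the environment keeps advancing at a fixed frame rate while an observation and the resulting action are in transit. The function name stale_frames and the example numbers (60 frames per second, 20 ms and 50 ms round trips) are illustrative assumptions, not measurements from the repository.

```python
def stale_frames(round_trip_ms, fps):
    """Frames the environment advances during one observation/action round trip.

    Uses integer ceiling division to avoid floating-point rounding surprises.
    """
    if round_trip_ms < 0 or fps <= 0:
        raise ValueError("round_trip_ms must be >= 0 and fps must be > 0")
    return -(-round_trip_ms * fps // 1000)


if __name__ == "__main__":
    # Assumed frame rate of 60 fps; the latencies are hypothetical examples.
    for ms in (0, 20, 50):
        print(ms, "ms ->", stale_frames(ms, 60), "frame(s) behind")
```

Under these assumptions, a 50 ms round trip means the action lands on a game state that is already three frames newer than the one the agent observed, which is one way to see why an agent tuned at one latency may need adjustment at another.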

Health Check

Last Commit: 7 years ago
Responsiveness: 1 day

Star History

3 stars in the last 30 days