random-network-distillation  by openai

RL research paper code

created 6 years ago
907 stars

Top 40.9% on sourcepulse

Project Summary

This repository provides the code for the paper "Exploration by Random Network Distillation" (RND). RND drives reinforcement learning agents toward novel states by paying an exploration bonus wherever a trained predictor network fails to match the output of a fixed, randomly initialized network. This is particularly useful in sparse-reward environments such as Montezuma's Revenge.

How It Works

RND uses two neural networks: a fixed, randomly initialized target network and a trainable predictor network. The predictor is trained to reproduce the target network's output on the states the agent encounters. The prediction error between the two outputs serves as an intrinsic reward: it is large for states unlike those the predictor has been trained on and shrinks as a state is revisited, so the agent is pushed toward unfamiliar states.
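
The mechanism can be illustrated with a minimal NumPy sketch. This is a toy under stated assumptions, not the repository's implementation: the single-layer tanh networks, the sizes, the learning rate, and every name below (`embed`, `intrinsic_reward`, `train_step`) are hypothetical choices made for illustration. "Familiar" states vary only in the first half of their dimensions, so the predictor never learns anything about the remaining ones.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 4  # hypothetical toy sizes

scale = 1.0 / np.sqrt(OBS_DIM)
# Target network: randomly initialized and never updated.
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM)) * scale
# Predictor network: same shape, trained to match the target.
W_pred = rng.normal(size=(OBS_DIM, FEAT_DIM)) * scale


def embed(obs, W):
    """Single-layer network: a linear map followed by tanh."""
    return np.tanh(obs @ W)


def intrinsic_reward(obs, W):
    """Per-state mean squared error between predictor and target outputs."""
    err = embed(obs, W) - embed(obs, W_target)
    return np.mean(err ** 2, axis=1)


def train_step(obs, W, lr=0.5):
    """One gradient-descent step on the batch mean squared prediction error."""
    pred = embed(obs, W)
    err = pred - embed(obs, W_target)
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    grad_pre = 2.0 * err * (1.0 - pred ** 2) / err.size
    return W - lr * (obs.T @ grad_pre)


# Familiar states only vary in the first half of the observation vector.
familiar = rng.normal(size=(32, OBS_DIM))
familiar[:, OBS_DIM // 2:] = 0.0
# Novel states vary in every dimension.
novel = rng.normal(size=(32, OBS_DIM))

reward_before = intrinsic_reward(familiar, W_pred).mean()
for _ in range(500):
    W_pred = train_step(familiar, W_pred)  # only the predictor is trained
reward_familiar = intrinsic_reward(familiar, W_pred).mean()
reward_novel = intrinsic_reward(novel, W_pred).mean()

print("familiar, before training:", reward_before)
print("familiar, after training: ", reward_familiar)
print("novel, after training:    ", reward_novel)
```

After training, the reward on familiar states should fall well below its starting value, while novel states keep a noticeably higher reward because the predictor was never fitted on the dimensions they exercise.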

Quick Start & Requirements

  • Primary run command: python run_atari.py --gamma_ext 0.999
  • Prerequisites: Python, MPI (for multi-GPU/multi-machine training), Atari environments.
  • To train on 1024 parallel environments using 8 GPUs (8 MPI processes × 128 environments each): mpiexec -n 8 python run_atari.py --num_env 128 --gamma_ext 0.999
  • Blog post and videos are available.
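
The relationship between the process count and the environment count in the multi-GPU command can be sketched as follows. This assumes `--num_env` is counted per MPI process, which is what the figures above imply (8 × 128 = 1024). The helper `build_command` is hypothetical and only assembles the command string from flags already shown in this section; it does not launch anything.

```python
import shlex


def build_command(num_workers, envs_per_worker, gamma_ext=0.999):
    """Return the mpiexec command line and the implied total environment count.

    Assumption: --num_env applies to each MPI process, so the total is
    num_workers * envs_per_worker.
    """
    cmd = [
        "mpiexec", "-n", str(num_workers),
        "python", "run_atari.py",
        "--num_env", str(envs_per_worker),
        "--gamma_ext", str(gamma_ext),
    ]
    return shlex.join(cmd), num_workers * envs_per_worker


command, total_envs = build_command(num_workers=8, envs_per_worker=128)
print(command)     # mpiexec -n 8 python run_atari.py --num_env 128 --gamma_ext 0.999
print(total_envs)  # 1024
```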

Highlighted Details

  • Implements the Random Network Distillation (RND) algorithm for intrinsic motivation.
  • Designed for Atari environments, with a focus on sparse-reward tasks like Montezuma's Revenge.
  • Supports distributed training via MPI for scaling across multiple GPUs and machines.

Maintenance & Community

  • Status: Archive (code is provided as-is, no updates expected).
  • Notable contributors: Yuri Burda, Harri Edwards, Amos Storkey, Oleg Klimov.

Licensing & Compatibility

  • License: Not explicitly stated in the provided README. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

The project is archived and will not receive further updates. The license is not specified, which may pose a barrier to commercial adoption or integration into closed-source projects.

Health Check

  • Last commit: 4 years ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 16 stars in the last 90 days
