vime by openai

RL research paper code

Created 9 years ago

348 stars

Top 80.2% on SourcePulse

5 Experts Love This Project

Aravind Srinivas

Cofounder of Perplexity

Evan Hubinger

Head of Alignment Stress-Testing at Anthropic

Deepak Pathak

Cofounder of Skild AI; Professor at CMU

Joshua Achiam

Head of Mission Alignment at OpenAI

and 1 more!

Project Summary

This repository provides the code for the paper "Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks," which introduces the VIME algorithm. It lets researchers and practitioners reproduce the paper's experiments on curiosity-driven exploration in deep reinforcement learning.

How It Works

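VIME (Variational Information Maximizing Exploration) uses a Bayesian neural network to model the environment dynamics and treats the agent's uncertainty about that model as the source of curiosity. Transitions that change the agent's belief about the dynamics the most, measured as information gain, earn an intrinsic reward bonus on top of the task reward. This encourages exploration of novel states and actions, aiming to improve sample efficiency and performance in complex reinforcement learning tasks.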
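The information-gain idea can be illustrated with a small sketch. It assumes the belief over each network weight is an independent Gaussian, so the gain from one transition is the KL divergence between the updated and the previous belief. The function name and the toy numbers are hypothetical and are not taken from the repository.

```python
import numpy as np


def kl_diag_gaussians(mu_new, sigma_new, mu_old, sigma_old):
    """KL[N(mu_new, sigma_new^2) || N(mu_old, sigma_old^2)], summed over weights.

    Assumes a fully factorized (diagonal) Gaussian belief over the weights.
    """
    var_new = np.asarray(sigma_new, dtype=float) ** 2
    var_old = np.asarray(sigma_old, dtype=float) ** 2
    mean_shift = np.asarray(mu_new, dtype=float) - np.asarray(mu_old, dtype=float)
    return 0.5 * float(
        np.sum(np.log(var_old / var_new) + (var_new + mean_shift ** 2) / var_old - 1.0)
    )


# Toy belief over three weights, before and after seeing one transition.
mu_old, sigma_old = np.zeros(3), np.ones(3)
mu_new, sigma_new = np.array([0.5, 0.0, -0.5]), np.array([0.8, 1.0, 0.8])

# A larger value means the transition was more informative about the dynamics.
info_gain = kl_diag_gaussians(mu_new, sigma_new, mu_old, sigma_old)
```

A transition that leaves the belief unchanged yields a gain of zero, so only surprising transitions would be rewarded under this sketch.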

Quick Start & Requirements

Install: Add this repository as a git submodule inside an existing rllab installation.
Prerequisites: A working rllab setup and Mujoco v1.31.
Execution: From the rllab root directory, run python sandbox/vime/experiments/run_trpo_expl.py.
Paper: http://arxiv.org/abs/1605.09674

Highlighted Details

Implements the VIME algorithm for curiosity-driven exploration.
Focuses on reproducing results from the associated research paper.
Includes code for TRPO+VIME on the hierarchical SwimmerGather environment.
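As a rough illustration of how a curiosity bonus can be combined with the task reward before a policy update such as TRPO, the sketch below adds a scaled, normalized information-gain term to each step's reward. The function name, the default value of eta, and the simple median normalization are assumptions made for illustration; the repository's actual implementation may differ.

```python
import numpy as np


def augment_rewards(extrinsic, kl_values, eta=0.01, kl_history=None):
    """Return extrinsic + eta * (kl / scale) for every step of a trajectory.

    scale is the median of previously observed KL values when a history is
    given, which keeps the bonus on a comparable scale as learning progresses.
    This normalization is a simplification chosen for the sketch.
    """
    extrinsic = np.asarray(extrinsic, dtype=float)
    kl_values = np.asarray(kl_values, dtype=float)
    scale = 1.0
    if kl_history is not None and len(kl_history) > 0:
        median = float(np.median(kl_history))
        if median > 0.0:
            scale = median
    return extrinsic + eta * kl_values / scale


# Two-step toy trajectory: task rewards plus per-step information gain.
shaped = augment_rewards([1.0, 0.0], [2.0, 4.0], eta=0.5, kl_history=[2.0, 2.0, 2.0])
```

Setting eta to zero recovers the plain task reward, which makes it easy to compare a run with and without the exploration bonus.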

Maintenance & Community

This project is archived and no longer actively maintained or updated.

Licensing & Compatibility

The repository does not explicitly state a license.

Limitations & Caveats

The code is provided as-is and is archived, so no further updates or support are expected. The only stated dependencies are rllab and Mujoco v1.31; compatibility with other versions is not documented.

Health Check

Last Commit

7 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days