imitation by openai

Imitation learning research paper code

Created 9 years ago

729 stars

Top 47.4% on SourcePulse

3 Experts Love This Project

Aravind Srinivas

Cofounder of Perplexity

Jiaming Song

Chief Scientist at Luma AI

Tom Brown

Cofounder of Anthropic

Project Summary

This repository provides code for Generative Adversarial Imitation Learning (GAIL), a method for learning policies from expert demonstrations. It is targeted at researchers and practitioners in reinforcement learning and robotics. The primary benefit is enabling agents to learn complex behaviors without explicit reward functions, relying solely on expert demonstrations.

How It Works

The implementation follows a GAN-style adversarial setup. The generator is the agent's policy, trained to produce state-action pairs that are indistinguishable from those produced by an expert policy. A discriminator network learns to tell expert state-action pairs from generated ones, and its output serves as the learning signal for the policy in place of an explicit reward. This adversarial process drives the policy to mimic expert behavior. Policy updates are performed with Trust Region Policy Optimization (TRPO).
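The adversarial loop described above can be illustrated with a small NumPy sketch. It is not taken from the repository: the linear (logistic) discriminator, the synthetic "expert" and "policy" feature vectors, the label convention (D close to 1 means "expert"), and the names `disc_loss_and_grad` and `surrogate_reward` are all assumptions chosen for illustration. The policy-update step (TRPO in the repository) is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def disc_loss_and_grad(w, expert_sa, policy_sa, eps=1e-8):
    """Binary cross-entropy for a linear discriminator D(s, a) = sigmoid(x . w).

    Assumed convention: D -> 1 on expert pairs, D -> 0 on policy pairs.
    """
    d_exp = sigmoid(expert_sa @ w)
    d_pol = sigmoid(policy_sa @ w)
    loss = -(np.log(d_exp + eps).mean() + np.log(1.0 - d_pol + eps).mean())
    # Analytic gradient of the loss with respect to w.
    grad = (-(expert_sa * (1.0 - d_exp)[:, None]).mean(axis=0)
            + (policy_sa * d_pol[:, None]).mean(axis=0))
    return loss, grad

def surrogate_reward(w, sa, eps=1e-8):
    """Reward handed to the policy optimizer: high where D thinks 'expert'."""
    return -np.log(1.0 - sigmoid(sa @ w) + eps)

# Synthetic stand-ins for concatenated (state, action) feature vectors.
rng = np.random.default_rng(0)
expert_sa = rng.normal(loc=1.0, size=(256, 4))
policy_sa = rng.normal(loc=-1.0, size=(256, 4))

w = np.zeros(4)
initial_loss, _ = disc_loss_and_grad(w, expert_sa, policy_sa)
for _ in range(200):  # plain gradient descent on the discriminator
    _, grad = disc_loss_and_grad(w, expert_sa, policy_sa)
    w -= 0.1 * grad
final_loss, _ = disc_loss_and_grad(w, expert_sa, policy_sa)

print("discriminator loss:", initial_loss, "->", final_loss)
print("mean surrogate reward (policy):", surrogate_reward(w, policy_sa).mean())
```

In a full training loop, the policy would be updated to increase this surrogate reward, new policy samples would be collected, and the discriminator would be retrained on them, repeating until the two distributions are hard to separate.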

Quick Start & Requirements

Install dependencies: pip install gym[mujoco] numpy scipy h5py tables pandas matplotlib
Requires mujoco_py >= 0.4.0 and OpenAI Gym >= 0.1.0.
Theano is a required backend, which may present installation challenges on modern systems.
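Because several of these dependencies are dated, it can help to check which ones are importable before running any experiment scripts. The following is a minimal sketch using only the Python standard library; the module list mirrors the packages named above, and the helper name `missing_modules` is hypothetical. It checks importability only and does not verify the version requirements.

```python
import importlib.util

# Import names for the dependencies listed above (PyTables imports as "tables").
REQUIRED = [
    "gym", "mujoco_py", "theano", "numpy", "scipy",
    "h5py", "tables", "pandas", "matplotlib",
]

def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```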

Highlighted Details

Implements Generative Adversarial Imitation Learning (GAIL).
Includes an implementation of Trust Region Policy Optimization (TRPO).
Provides expert policies trained via TRPO on true costs.
Organizes experiment specifications and evaluation results.
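For readers unfamiliar with TRPO, the core of a trust-region update can be sketched as follows. This is a generic illustration of the technique, not the repository's code: it assumes access to a policy gradient `grad` and a function `hvp` that returns products with the (approximate) KL Hessian, solves for the search direction with conjugate gradient, and scales the step so the quadratic KL estimate equals `max_kl`. The backtracking line search that TRPO normally applies afterwards is omitted, and all names are hypothetical.

```python
import numpy as np

def conjugate_gradient(hvp, b, iters=10, tol=1e-10):
    """Approximately solve H x = b using only Hessian-vector products."""
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b.copy()   # residual b - H x (x starts at zero)
    p = b.copy()   # current search direction
    rr = r @ r
    for _ in range(iters):
        Hp = hvp(p)
        alpha = rr / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rr_new = r @ r
        if rr_new < tol:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

def trust_region_step(grad, hvp, max_kl=0.01):
    """Scale the natural-gradient direction so that 0.5 * s^T H s == max_kl."""
    direction = conjugate_gradient(hvp, grad)
    quad = direction @ hvp(direction)
    return np.sqrt(2.0 * max_kl / quad) * direction

# Toy example with an explicit positive-definite matrix standing in for the KL Hessian.
H = np.array([[2.0, 0.0], [0.0, 1.0]])
g = np.array([1.0, 1.0])
step = trust_region_step(g, lambda v: H @ v, max_kl=0.01)
print("step:", step, "quadratic KL estimate:", 0.5 * step @ H @ step)
```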

Maintenance & Community

This project is archived and no updates are expected.

Licensing & Compatibility

The repository does not explicitly state a license. In the absence of a LICENSE file, users should not assume that commercial use or redistribution is permitted without explicit permission.

Limitations & Caveats

The project is archived, indicating no ongoing development or support. The dependency on Theano, an older deep learning framework, may make installation and compatibility with current hardware and software stacks difficult.

Health Check

Last Commit

7 years ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days