imitation-learning by Kaixhin

Imitation learning algorithms (SAC base); companion code for a research paper

created 5 years ago
543 stars

Top 59.5% on sourcepulse

View on GitHub
Project Summary

This repository provides a pragmatic implementation of various deep imitation learning algorithms, primarily built upon the Soft Actor-Critic (SAC) framework. It targets researchers and practitioners in reinforcement learning and robotics who need to compare and experiment with different imitation learning techniques for learning policies from expert demonstrations. The library offers a flexible and modular approach to evaluating algorithms like GAIL, DRIL, AdRIL, and others, facilitating reproducible research.

How It Works

The library builds on the Soft Actor-Critic (SAC) off-policy reinforcement learning algorithm. Imitation learning methods are integrated by modifying the reward signal or adding discriminator networks: GAIL trains a discriminator to distinguish agent trajectories from expert trajectories, while DRIL derives its cost from the disagreement of a behavioral-cloning ensemble (approximated here with dropout). The implementation also supports state-only imitation, absorbing-state indicators, and mixing expert data with agent data for improved sample efficiency.
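As a rough illustration of the discriminator-as-reward idea, the block below is a minimal PyTorch sketch, not the repository's actual code; the names `Discriminator`, `discriminator_loss`, and `imitation_reward` are illustrative. A GAIL-style setup trains a binary classifier on expert versus agent state-action pairs and converts its output into a surrogate reward for the RL learner:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Scores (state, action) pairs: high logits for expert-like pairs."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))  # raw logits

def discriminator_loss(disc, expert_s, expert_a, agent_s, agent_a):
    # Binary cross-entropy: push expert pairs towards 1, agent pairs towards 0
    expert_logits = disc(expert_s, expert_a)
    agent_logits = disc(agent_s, agent_a)
    return (F.binary_cross_entropy_with_logits(expert_logits, torch.ones_like(expert_logits))
            + F.binary_cross_entropy_with_logits(agent_logits, torch.zeros_like(agent_logits)))

def imitation_reward(disc, state, action):
    # Surrogate reward -log(1 - D(s, a)), computed stably from logits
    with torch.no_grad():
        return -F.logsigmoid(-disc(state, action))
```

In an SAC-based pipeline of this kind, the surrogate reward would stand in for the environment reward when updating the critics, while the discriminator is periodically retrained on fresh agent rollouts.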

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: PyTorch, OpenAI Gym, D4RL, and Hydra; Ax and the Hydra Ax Sweeper plugin are needed only for hyperparameter optimization.
  • Run: python train.py algorithm=<ALG> env=<ENV> (e.g., python train.py algorithm=GAIL env=hopper)
  • Documentation: no dedicated docs are linked; the configuration files and script structure serve as guidance (a minimal Hydra entry-point sketch follows this list).
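For orientation, here is a small, self-contained sketch of a Hydra-driven entry point of the kind train.py implies. The TrainConfig fields and defaults are hypothetical, not the repository's actual configuration schema; the point is only to show how `algorithm=<ALG> env=<ENV>` command-line overrides map onto a structured config:

```python
from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import DictConfig

@dataclass
class TrainConfig:
    # Hypothetical fields mirroring the CLI overrides shown above
    algorithm: str = "SAC"   # e.g. GAIL, DRIL, AdRIL, ...
    env: str = "hopper"      # e.g. hopper, walker2d, ...

# Register the structured config so Hydra can resolve it by name
cs = ConfigStore.instance()
cs.store(name="train", node=TrainConfig)

@hydra.main(config_name="train", version_base=None)
def main(cfg: DictConfig) -> None:
    # Overrides such as `python sketch.py algorithm=GAIL env=hopper` land here
    print(f"algorithm={cfg.algorithm} env={cfg.env}")

if __name__ == "__main__":
    main()
```

The Ax-based Bayesian optimization mentioned above plugs into the same override mechanism through Hydra's sweeper/multirun interface.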

Highlighted Details

  • Implements AdRIL, DRIL (dropout version), GAIL, GMMIL, PWIL, RED, and BC pretraining.
  • Supports various discriminator options, reward shaping (AIRL), gradient penalty, and spectral normalization (see the sketch after this list).
  • Benchmarked on Gym MuJoCo environments with D4RL "expert-v2" datasets.
  • Includes utilities for hyperparameter sweeps and Bayesian optimization.
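To make two of the discriminator regularizers above concrete, here is a generic PyTorch sketch (again not the repository's implementation; dimensions and names are placeholders): spectral normalization is applied by wrapping linear layers, and a gradient penalty is computed on interpolations between expert and agent samples.

```python
import torch
import torch.nn as nn

in_dim = 14  # placeholder: concatenated state-action dimensionality

# Spectral normalization: constrain each layer's spectral norm to ~1
disc = nn.Sequential(
    nn.utils.spectral_norm(nn.Linear(in_dim, 256)), nn.ReLU(),
    nn.utils.spectral_norm(nn.Linear(256, 1)),
)

def gradient_penalty(disc: nn.Module, expert: torch.Tensor, agent: torch.Tensor) -> torch.Tensor:
    # Penalize the discriminator's gradient norm on expert/agent interpolations
    alpha = torch.rand(expert.size(0), 1)
    mixed = (alpha * expert + (1 - alpha) * agent).requires_grad_(True)
    grads = torch.autograd.grad(disc(mixed).sum(), mixed, create_graph=True)[0]
    return ((grads.norm(2, dim=-1) - 1.0) ** 2).mean()
```

Either term (or both) would simply be added to the discriminator's classification loss during its update step.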

Maintenance & Community

The project acknowledges contributions from Kai Arulkumaran and Dan Ogawa Lillrank, with citations to relevant research papers and GitHub repositories. There is no explicit mention of active community channels like Discord or Slack, nor a public roadmap.

Licensing & Compatibility

The provided README does not state a license, so licensing terms should be verified before commercial use or integration into closed-source projects.

Limitations & Caveats

The README mentions that v1.0 contained on-policy algorithms, implying v2.0 (the current version) focuses on off-policy methods. The absence of an explicit license is a significant caveat for adoption. The project's primary focus is on MuJoCo environments, and compatibility with other simulation platforms may require modifications.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 23 stars in the last 90 days
