PILCO by nrontsis

TensorFlow v2 implementation of the PILCO algorithm

created 7 years ago
331 stars

Top 83.8% on sourcepulse

Project Summary

This repository provides a modern TensorFlow 2 implementation of the Probabilistic Inference for Learning Control (PILCO) algorithm, targeting researchers and practitioners in Bayesian Reinforcement Learning. It offers a clean, GPU-scalable approach to learning control policies by leveraging Gaussian Processes for probabilistic modeling.

How It Works

PILCO fits Gaussian Process (GP) regression models to observed transitions, so the learned dynamics model carries an explicit estimate of its own uncertainty. This probabilistic approach allows for sample-efficient learning and robust control, especially when data is limited. The implementation uses TensorFlow 2 for automatic differentiation and GPU acceleration, and GPflow 2 for GP regression. Compared with older MATLAB-based implementations, this is intended to improve scalability and ease of integration.
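To make the role of the GP concrete, here is a minimal sketch of GP regression on a toy one-dimensional system, written in plain Python so that it runs without TensorFlow or GPflow. It is not the repository's code: the kernel hyperparameters are fixed rather than learned, the toy function `sin(x)` stands in for real dynamics, and all function names (`rbf`, `gp_predict`, and so on) are assumptions made for illustration. The point it shows is that the prediction comes with a variance that is small near observed data and reverts to the prior far from it.

```python
import math

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_predict(xs, ys, x_star, noise=1e-4):
    """Posterior mean and variance of the latent function at x_star."""
    n = len(xs)
    # Kernel matrix of the training inputs, with observation noise on the diagonal.
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, ys))  # (K + noise*I)^-1 y
    k_star = [rbf(x, x_star) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_star, x_star) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

# Toy "transitions": observed inputs and the (assumed) response sin(x).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.sin(x) for x in xs]

near_mean, near_var = gp_predict(xs, ys, 1.0)   # at an observed input
far_mean, far_var = gp_predict(xs, ys, 10.0)    # far from all observations
```

In the actual project, GP regression is delegated to GPflow 2, which also fits the hyperparameters that this sketch hard-codes.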

Quick Start & Requirements

  • Install from source with: git clone https://github.com/nrontsis/PILCO && cd PILCO && python setup.py develop
  • Requires Python >= 3.7.
  • Examples depend on OpenAI gym 0.15.3 and mujoco-py 2.0.2.7, which must be installed manually.
  • Run an example with: python examples/inverted_pendulum.py
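Because the examples pin exact versions, it can help to check the environment before running them. The following is a small sketch of such a check, not something shipped with the repository. It assumes the packages are registered under the distribution names `gym` and `mujoco-py`, and it uses `importlib.metadata`, which is in the standard library from Python 3.8 onward (on 3.7 a backport would be needed). The helper names `installed_version` and `find_mismatches` are hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8 onward

# Pinned versions from the requirements above; the distribution names are an assumption.
REQUIRED = {"gym": "0.15.3", "mujoco-py": "2.0.2.7"}

def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

def find_mismatches(required, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    problems = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found != wanted:
            problems[name] = (wanted, found)
    return problems

if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: need {wanted}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without the packages installed.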

Highlighted Details

  • TensorFlow 2 and GPflow 2 implementation for GPU scalability and modern ML practices.
  • Core functionality tested against original MATLAB implementation.
  • Includes an extension for "Safe PILCO" incorporating state-space safety constraints.

Maintenance & Community

  • Developed by Nikitas Rontsis and Kyriakos Polymenakos.
  • References to relevant publications are provided.

Licensing & Compatibility

  • The README does not explicitly state a license.

Limitations & Caveats

The project requires specific, older versions of OpenAI gym and mujoco-py, which may pose installation challenges or compatibility issues with newer environments. The absence of an explicit license could impact commercial use or integration into closed-source projects.

Health Check

  • Last commit: 4 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 9 stars in the last 90 days
