ddpg-aigym by stevenpjg

DDPG implementation for continuous control in OpenAI Gym

created 9 years ago
276 stars

Top 94.7% on sourcepulse

Project Summary

This repository provides a TensorFlow implementation of the Deep Deterministic Policy Gradient (DDPG) algorithm for continuous control tasks, targeting researchers and practitioners in reinforcement learning. It aims to offer a clear and functional DDPG agent for OpenAI Gym environments.

How It Works

The implementation follows the DDPG algorithm described in Lillicrap et al. (arXiv:1509.02971). It uses an actor-critic architecture with separate networks for the policy (actor) and value function (critic), each paired with a slowly updated target network as the algorithm prescribes. Key features include batch normalization for faster learning and a "grad-inverter" component (arXiv:1511.04143) that keeps actions within their bounds by scaling, and inverting, policy gradients near the action limits.
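The grad-inverter rule can be sketched in a few lines. This is an illustrative pure-Python version of the technique from arXiv:1511.04143, not the repo's actual code or API: each gradient component is scaled by the action's remaining headroom toward the bound it is pushing against, so updates shrink to zero at the boundary.

```python
def invert_gradients(grads, actions, p_min, p_max):
    """Scale each policy-gradient component by its remaining headroom.

    If the gradient pushes the action up, scale by the distance to the
    upper bound; if it pushes down, scale by the distance to the lower
    bound. At a bound, a gradient pointing out of range becomes zero.
    Names here are illustrative, not the repository's actual API.
    """
    inverted = []
    for g, p, lo, hi in zip(grads, actions, p_min, p_max):
        width = hi - lo
        if g > 0:  # increasing the action: headroom above
            inverted.append(g * (hi - p) / width)
        else:      # decreasing the action: headroom below
            inverted.append(g * (p - lo) / width)
    return inverted
```

This keeps the actor's outputs inside the valid action range without hard clipping, which would otherwise zero out the gradient signal entirely once an action saturates.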

Quick Start & Requirements

  • Install via git clone https://github.com/stevenpjg/ddpg-aigym.git and cd ddpg-aigym.
  • Run with python main.py.
  • Dependencies: TensorFlow (developed with 0.11.0rc0), OpenAI Gym, MuJoCo.
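The shape of the training loop that main.py is expected to run can be sketched as below. A stub stands in for a real Gym environment so the interaction pattern is clear without MuJoCo installed; the agent and environment names are illustrative assumptions, not the repo's actual classes.

```python
import random

class StubEnv:
    """Stands in for gym.make(...): reset() -> state, step(a) -> (s, r, done, info)."""
    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        reward = -abs(action[0])      # toy reward: keep the action near zero
        done = self.t >= 5            # fixed-length toy episodes
        return [float(self.t)], reward, done, {}

class RandomAgent:
    """Stands in for the DDPG agent: act from the actor, observe into replay."""
    def act(self, state):
        return [random.uniform(-1.0, 1.0)]

    def observe(self, state, action, reward, next_state, done):
        pass                          # a real agent would store and train here

env, agent = StubEnv(), RandomAgent()
for episode in range(3):
    state, done, total = env.reset(), False, 0.0
    while not done:
        action = agent.act(state)
        next_state, reward, done, _ = env.step(action)
        agent.observe(state, action, reward, next_state, done)
        state, total = next_state, total + reward
```

With real dependencies installed, the same loop runs against an actual MuJoCo environment and a DDPG agent that learns from the observed transitions.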

Highlighted Details

  • Implements DDPG algorithm from Lillicrap et al.
  • Includes Batch Normalization for faster learning.
  • Incorporates "grad-inverter" technique.
  • Supports customization of Gym environments and batch normalization via script variables.
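Customization appears to work by editing constants near the top of the training script. The variable names below are hypothetical examples of what such script-level switches typically look like, and may not match main.py exactly:

```python
# Hypothetical script-level configuration; actual names in main.py may differ.
experiment = 'InvertedPendulum-v1'   # any continuous-control Gym environment id
is_batch_norm = True                 # set False to train without batch normalization
```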

Maintenance & Community

No information on maintainers, community channels, or roadmap is provided in the README.

Licensing & Compatibility

The README does not specify a license. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

The implementation is based on an older TensorFlow version (0.11.0rc0), which may present compatibility issues with modern TensorFlow or Python versions. The lack of explicit licensing information is a significant caveat for adoption.

Health Check

  • Last commit: 7 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
