DDPG by floodsung

DDPG implementation for continuous control tasks

Created 9 years ago
569 stars

Top 56.6% on SourcePulse

Project Summary

This repository provides a Python reimplementation of the Deep Deterministic Policy Gradient (DDPG) algorithm, a popular deep reinforcement learning method for continuous control tasks. It is designed for researchers and practitioners working with OpenAI Gym environments and TensorFlow.

How It Works

The implementation uses TensorFlow to build and train the actor and critic networks, following the architecture described in the DDPG paper. Batch Normalization is applied successfully to the actor network; applying it to the critic network is noted as problematic.
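As a rough illustration of two steps from the DDPG paper that an actor-critic setup like this relies on, the sketch below computes the critic's bootstrapped targets and the soft update of the target networks. It is plain Python rather than the repository's TensorFlow code; the function names are hypothetical, and the values of GAMMA and TAU are the ones reported in the DDPG paper, which may differ from what this repository uses.

```python
from typing import List, Sequence

GAMMA = 0.99   # discount factor (value from the DDPG paper; assumed here)
TAU = 0.001    # soft-update rate (value from the DDPG paper; assumed here)


def td_targets(rewards: Sequence[float],
               next_q_values: Sequence[float],
               dones: Sequence[bool],
               gamma: float = GAMMA) -> List[float]:
    """Critic targets: y = r + gamma * Q'(s', mu'(s')).

    next_q_values are assumed to come from the target critic evaluated
    on the target actor's action. Terminal transitions do not bootstrap.
    """
    return [r + (0.0 if done else gamma * q)
            for r, q, done in zip(rewards, next_q_values, dones)]


def soft_update(target_params: Sequence[float],
                online_params: Sequence[float],
                tau: float = TAU) -> List[float]:
    """Target-network update: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * w + (1.0 - tau) * w_target
            for w_target, w in zip(target_params, online_params)]
```

With a small tau, the target networks trail the online networks slowly, which keeps the critic's regression targets from shifting abruptly between updates.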

Quick Start & Requirements

  • Primary install / run command:
    git clone https://github.com/songrotek/DDPG.git
    cd DDPG
    python gym_ddpg.py
    
  • Prerequisites: OpenAI Gym, TensorFlow.
  • To change environments, modify ENV_NAME in gym_ddpg.py. To change network architecture, adjust imports in ddpg.py.

Highlighted Details

  • Implements DDPG for continuous control tasks.
  • Utilizes OpenAI Gym for environment simulation.
  • Batch Normalization is successfully applied to the actor network.
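For readers unfamiliar with the Batch Normalization step mentioned above, here is a minimal training-mode forward pass in plain Python. It is a sketch of the general technique, not the repository's TensorFlow layer: the function name batch_norm and the eps default are assumptions, and the running statistics used at inference time are left out.

```python
import math
from typing import List


def batch_norm(batch: List[List[float]],
               gamma: List[float],
               beta: List[float],
               eps: float = 1e-5) -> List[List[float]]:
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: learned per-feature scale and shift.
    """
    n = len(batch)
    dim = len(batch[0])
    # Per-feature mean and (biased) variance over the batch.
    means = [sum(row[j] for row in batch) / n for j in range(dim)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n
                 for j in range(dim)]
    return [
        [gamma[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
         for j in range(dim)]
        for row in batch
    ]
```

In an actor network, a step like this would typically sit between a layer's linear output and its nonlinearity, so that state features on very different scales reach the activation in a comparable range.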

Maintenance & Community

No specific information on contributors, sponsorships, or community channels is provided in the README.

Licensing & Compatibility

The README does not explicitly state a license.

Limitations & Caveats

Batch Normalization on the critic network is reported as problematic. Several MuJoCo environments (InvertedPendulum, InvertedDoublePendulum, Hopper) are listed as unsolved with this implementation.

Health Check

  • Last commit: 4 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 2 stars in the last 30 days

Explore Similar Projects

Starred by Nathan Lambert (Research Scientist at AI2), Phil Wang (Prolific Research Paper Implementer), and 1 more.

TD3 by sfujim

0.3%
2k
PyTorch implementation of TD3 for OpenAI gym tasks
Created 7 years ago
Updated 2 years ago