PyTorch implementation of DDPG for continuous RL
This repository provides a PyTorch implementation of the Deep Deterministic Policy Gradient (DDPG) algorithm for continuous-action reinforcement learning problems. It is aimed at researchers and practitioners who want to apply actor-critic methods to control tasks with continuous action spaces, and it is intended to be a clear, functional DDPG agent.
How It Works
The core of the implementation is the DDPG algorithm, which selects actions with a deterministic policy and explores the continuous action space by adding Ornstein-Uhlenbeck noise. It uses separate actor and critic networks, each implemented as a 3-layer neural network: the actor maps a state to an action, while the critic takes a state-action pair and estimates its Q-value. Training minimizes the negative Q-value for the actor and a temporal-difference error for the critic, and soft updates blend the online weights into target networks to improve stability.
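The sketch below illustrates these pieces in PyTorch. All names, layer sizes, and noise parameters are illustrative assumptions for this write-up, not the repository's actual code:

import torch
import torch.nn as nn
import numpy as np

class Actor(nn.Module):
    # 3-layer MLP: state -> action, squashed to [-1, 1] with tanh.
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    # 3-layer MLP: (state, action) -> scalar Q-value estimate.
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

class OUNoise:
    # Ornstein-Uhlenbeck process: temporally correlated exploration noise,
    # added to the deterministic action at each environment step.
    def __init__(self, action_dim, mu=0.0, theta=0.15, sigma=0.2):
        self.mu, self.theta, self.sigma = mu, theta, sigma
        self.state = np.full(action_dim, mu)

    def sample(self):
        dx = self.theta * (self.mu - self.state) \
             + self.sigma * np.random.randn(*self.state.shape)
        self.state = self.state + dx
        return self.state

def soft_update(target, source, tau=0.005):
    # Polyak-average the online weights into the target network.
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.copy_(tau * s.data + (1.0 - tau) * t.data)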
Quick Start & Requirements
pip install -r requirements.txt
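The README's quick start covers only dependency installation. As a hedged illustration of how the components sketched above fit together, the following shows a single DDPG update step, building on the Actor, Critic, OUNoise, and soft_update definitions from the previous section; the hyperparameters (learning rates, discount, tau) are common defaults, not necessarily the repository's:

import torch
import torch.nn.functional as F

state_dim, action_dim = 3, 1  # e.g., a pendulum-style control task
actor, critic = Actor(state_dim, action_dim), Critic(state_dim, action_dim)
actor_target, critic_target = Actor(state_dim, action_dim), Critic(state_dim, action_dim)
actor_target.load_state_dict(actor.state_dict())
critic_target.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
noise = OUNoise(action_dim)
gamma = 0.99

def update(s, a, r, s2, done):
    # One DDPG update on a (hypothetical) replay-buffer batch; s, a, r, s2,
    # and done are float tensors with a leading batch dimension.
    with torch.no_grad():
        target_q = r + gamma * (1 - done) * critic_target(s2, actor_target(s2))
    critic_loss = F.mse_loss(critic(s, a), target_q)  # temporal-difference error
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    actor_loss = -critic(s, actor(s)).mean()  # minimize -Q, i.e. ascend the Q-value
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    soft_update(actor_target, actor)   # slowly track the online networks
    soft_update(critic_target, critic)

# Acting: add OU noise to the deterministic action at each environment step, e.g.
# action = actor(torch.as_tensor(obs, dtype=torch.float32)).detach().numpy() + noise.sample()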
Maintenance & Community
No specific information on contributors, sponsorships, or community channels is present in the README. The repository was last updated roughly four years ago and is marked inactive.
Licensing & Compatibility
The README does not explicitly state a license.
Limitations & Caveats
The README does not detail any specific limitations, known bugs, or deprecation status. The implementation is presented as a functional example rather than a production-ready library.