TD3 by sfujim

PyTorch implementation of TD3 for OpenAI gym tasks

created 7 years ago
1,917 stars

Top 23.2% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of Twin Delayed Deep Deterministic Policy Gradients (TD3), an actor-critic algorithm designed to address function approximation errors in reinforcement learning. It is targeted at researchers and practitioners working with continuous control tasks, offering a robust baseline for benchmarking and experimentation.

How It Works

TD3 improves upon DDPG with three modifications. Clipped double Q-learning trains two critics and uses the smaller of their two estimates when forming the learning target. Target policy smoothing adds clipped noise to the target action so that the value estimate is averaged over nearby actions. Delayed policy updates refresh the actor and the target networks less often than the critics. Together, these techniques reduce the overestimation bias in Q-value estimates and lead to more stable policy learning. The implementation is built on PyTorch and uses its automatic differentiation and GPU acceleration.
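The three modifications can be illustrated with a minimal sketch. It is written in plain Python with scalar states and actions rather than PyTorch tensors, and it is not the repository's code. The function names (`td3_target`, `should_update_actor`), the callables passed in, and the default values for `discount`, `policy_noise`, `noise_clip`, and `policy_freq` are assumptions chosen for illustration only.

```python
import random


def clip(value, low, high):
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, value))


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               max_action=1.0, discount=0.99,
               policy_noise=0.2, noise_clip=0.5, rng=random):
    """Compute a TD3-style learning target for one transition.

    All callables and default values here are illustrative assumptions.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action,
    # then keep the result inside the valid action range.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, -max_action, max_action)

    # Clipped double Q-learning: use the smaller of the two target critics.
    target_q = min(critic1_target(next_state, next_action),
                   critic2_target(next_state, next_action))

    # No bootstrapping past a terminal transition.
    return reward + (0.0 if done else discount * target_q)


def should_update_actor(iteration, policy_freq=2):
    """Delayed policy updates: update the actor every `policy_freq` critic updates."""
    return iteration % policy_freq == 0


if __name__ == "__main__":
    # Hypothetical stand-ins for the target networks.
    actor = lambda s: 0.5 * s
    q1 = lambda s, a: s + a
    q2 = lambda s, a: s + a - 0.3  # the more pessimistic critic

    # With the noise switched off, the target uses q2's estimate.
    print(td3_target(1.0, False, 1.0, actor, q1, q2, policy_noise=0.0))

    # Critics are updated on every iteration, the actor only on some of them.
    print([i for i in range(1, 7) if should_update_actor(i)])
```

In a full implementation the same target would be computed per batch with tensors, and both critics would be regressed toward it on every iteration, while the actor and the target networks are updated only when `should_update_actor` returns true.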

Quick Start & Requirements

  • Primary install / run command: ./run_experiments.sh or python main.py --env HalfCheetah-v2
  • Prerequisites: PyTorch 1.2, Python 3.7, MuJoCo, OpenAI Gym.
  • Links: Learning Curves, Video

Highlighted Details

  • Implements TD3 and DDPG, and includes scripts for reproducing the paper's results.
  • Tested on MuJoCo continuous control tasks.
  • Learning curves are provided as NumPy arrays, representing average rewards over 1 million time steps.

Maintenance & Community

  • The code is no longer exactly representative of the implementation used in the paper due to minor adjustments for performance.
  • A BibTeX citation is provided for the original paper.

Licensing & Compatibility

  • The repository does not explicitly state a license.

Limitations & Caveats

The code differs slightly from the version used to generate the paper's results: minor hyperparameter adjustments were made to improve performance.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 64 stars in the last 90 days
