pytorch-REINFORCE by chingyaoc

PyTorch implementation of REINFORCE for control tasks

Created 8 years ago

267 stars

Top 96.1% on SourcePulse

1 Expert Loves This Project

Soumith Chintala

Coauthor of PyTorch

Project Summary

This repository provides a PyTorch implementation of the REINFORCE algorithm, a foundational policy gradient method for reinforcement learning. It is designed for researchers and practitioners looking to experiment with or apply REINFORCE to both discrete and continuous control tasks, specifically within OpenAI Gym environments.

How It Works

The implementation uses a neural network to approximate the policy. For discrete action spaces, the network typically outputs a probability for each action; for continuous spaces, it outputs the parameters (e.g., mean and standard deviation) of a distribution from which actions are sampled. REINFORCE then updates the network's weights by gradient ascent on the expected return: the gradient of each taken action's log-probability is scaled by the discounted return that followed that action.
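The update described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than PyTorch, so the arithmetic stays visible; it is not taken from the repository. The function names discounted_returns and reinforce_loss, the discount factor, and the toy episode are all assumptions made for the example.

```python
import math

def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each step t."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def reinforce_loss(log_probs, returns):
    """Surrogate loss: minimizing it follows the REINFORCE gradient estimate.

    Each log-probability is weighted by the return that followed the action.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))

# Hypothetical 3-step episode with a reward of 1 at every step.
rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)

# Assume the policy assigned probability 0.5 to each action it took.
log_probs = [math.log(0.5)] * 3
loss = reinforce_loss(log_probs, returns)
```

In a PyTorch version, log_probs would be tensors produced by the policy network, and calling backward on the loss would yield the weight gradients. Many implementations also normalize the returns or subtract a baseline to reduce variance; whether this repository does so is not stated in the summary.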

Quick Start & Requirements

Primary install / run command: python main.py --env_name [name of environment]
Prerequisites: Python 2.7, PyTorch, OpenAI Gym. MuJoCo is optional.
Links: pytorch example

Highlighted Details

Supports both discrete and continuous action spaces.
Tested with OpenAI Gym environments like CartPole-v0 (discrete) and InvertedPendulum-v1 (continuous).
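To make the difference between the two action-space types concrete, here is a small sketch in plain Python (not the repository's code) of how an action and its log-probability might be obtained in each case. The helper names sample_discrete and sample_gaussian, the example probabilities, and the one-dimensional Gaussian are assumptions chosen for illustration.

```python
import math
import random

def sample_discrete(probs, rng):
    """Sample an action index from per-action probabilities; return (action, log_prob)."""
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, math.log(probs[action])

def sample_gaussian(mean, std, rng):
    """Sample a scalar action from N(mean, std^2); return (action, log_prob)."""
    action = rng.gauss(mean, std)
    # Log-density of a univariate normal distribution evaluated at the sample.
    log_prob = (
        -((action - mean) ** 2) / (2 * std ** 2)
        - math.log(std)
        - 0.5 * math.log(2 * math.pi)
    )
    return action, log_prob

rng = random.Random(0)  # seeded only to make the example repeatable

# Discrete case (e.g. CartPole-v0 has two actions): hypothetical policy output.
a_d, lp_d = sample_discrete([0.7, 0.3], rng)

# Continuous case (e.g. a one-dimensional action): hypothetical mean and std.
a_c, lp_c = sample_gaussian(0.0, 1.0, rng)
```

In both cases the log-probability of the sampled action is the only quantity REINFORCE needs from the policy, which is why one training loop can serve both kinds of environment.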

Maintenance & Community

No information on contributors, sponsorships, or community channels is available in the README.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The project requires Python 2.7, which is end-of-life and may present compatibility issues with modern libraries. The lack of a specified license could restrict commercial use or integration into closed-source projects.

Health Check

Last Commit

8 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days