pytorch-es by atgambardella

PyTorch implementation of Evolution Strategies for Markov Decision Processes

Created 8 years ago

354 stars

Top 78.9% on SourcePulse


4 Experts Love This Project

James Bradbury

Head of Compute at Anthropic

Project Summary

This repository provides a PyTorch implementation of Evolution Strategies (ES), a black-box optimization algorithm for training neural networks in reinforcement learning tasks. It is suitable for researchers and practitioners looking for an alternative to policy gradient methods, offering potential for efficient parallelization.

How It Works

The implementation uses ES to optimize neural network parameters directly, bypassing traditional reinforcement learning techniques like policy gradients. It takes a population-based approach: multiple randomly perturbed copies of a network are evaluated in the environment, and a parameter update is estimated from the returns those perturbed copies achieve rather than from backpropagation. The use of the SELU nonlinearity is noted as a computationally efficient alternative to virtual batch normalization.
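The perturb, evaluate, and update loop described above can be sketched in a few lines of NumPy. This is a minimal illustration of the general ES estimator, not the repository's code: the names es_step, toy_fitness, and run_es are hypothetical, the hyperparameters are arbitrary, and the toy objective stands in for an episode return from a real environment.

```python
import numpy as np


def es_step(theta, fitness_fn, rng, pop_size=50, sigma=0.1, lr=0.02):
    """One Evolution Strategies update on a flat parameter vector."""
    # Sample one Gaussian perturbation direction per population member.
    noise = rng.standard_normal((pop_size, theta.size))
    # Evaluate each perturbed parameter vector (higher is better).
    returns = np.array([fitness_fn(theta + sigma * eps) for eps in noise])
    # Standardize returns so the step size does not depend on reward scale.
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Gradient estimate: perturbations weighted by their standardized returns.
    grad_estimate = noise.T @ advantages / (pop_size * sigma)
    return theta + lr * grad_estimate


def toy_fitness(params):
    # Hypothetical stand-in for an episode return; maximal at params == 3.
    return -float(np.sum((params - 3.0) ** 2))


def run_es(steps=300, seed=0):
    # Assumed setup: five parameters, all starting at zero.
    rng = np.random.default_rng(seed)
    theta = np.zeros(5)
    for _ in range(steps):
        theta = es_step(theta, toy_fitness, rng)
    return theta
```

Calling run_es() should move the parameters from zero toward the optimum at 3. In a reinforcement learning setting, fitness_fn would instead load the perturbed parameters into a policy network and return the total reward of one episode, and the population members could be evaluated in parallel workers.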

Quick Start & Requirements

Install via pip install -r requirements.txt (after cloning).
Requires Python 3.5+, PyTorch >= 0.2.0, numpy, gym, universe, cv2.
Example usage: python3 main.py --small-net --env-name CartPole-v1
Official documentation or demo links are not explicitly provided in the README.

Highlighted Details

Implements both small networks for simple tasks and Convnet-LSTM for Atari games.
Offers options for testing restored checkpoints and rendering environments.
The author notes deviations from the original paper, including a different approach to reward passing between workers and not adaptively changing episode length.
Performance on Atari games reportedly increased with larger neural network sizes.

Maintenance & Community

Contributions via GitHub issues and pull requests are welcomed.
No specific community channels (Discord/Slack) or roadmap are mentioned.

Licensing & Compatibility

Licensed under the MIT License.
Permissive for commercial use and closed-source linking.

Limitations & Caveats

The implementation targets PyTorch 0.2.0 and later, a release line that is now significantly outdated. The README also mentions an unsupported slow_version branch for managing threads, which suggests possible stability or performance issues with certain configurations.

Health Check

Last commit: 8 years ago
Responsiveness: 1 day
Star history: 0 stars in the last 30 days