batch-ppo by google-research

TensorFlow infrastructure for batched reinforcement learning

Created 8 years ago
969 stars

Top 38.0% on SourcePulse

Project Summary

This project provides an optimized infrastructure for reinforcement learning agents implemented in TensorFlow, specifically targeting efficient batched computation across multiple parallel environments. It offers an implementation of Proximal Policy Optimization (PPO) as a starting point for researchers and practitioners looking to build and experiment with RL algorithms.

How It Works

The core of the project is a batched environment interface designed to work inside TensorFlow. agents.tools.wrappers.ExternalProcess runs each Gym environment in its own process, so environment steps are not serialized by Python's global interpreter lock (GIL) and can run in parallel. agents.tools.BatchEnv combines these environments behind one interface that accepts a batch of actions and returns batches of observations, rewards, and done flags. agents.tools.InGraphBatchEnv brings the batch into the TensorFlow graph, exposing environment steps as operations. Finally, agents.tools.simulate() fuses the environment step and the agent's update into a single TensorFlow operation that the training loop calls repeatedly.
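To make the batched interface concrete, here is a minimal sketch in plain Python. It does not use the project's API: ToyEnv and ToyBatchEnv are hypothetical names invented for illustration, the environments run in the same process rather than in external processes, and nothing is placed in a TensorFlow graph. It only shows the shape of the idea, one action in and one result out per environment on every step.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym environment: a 1-D walk scored by distance to zero."""

    def __init__(self, seed, max_steps=5):
        self._rng = random.Random(seed)
        self._max_steps = max_steps
        self._state = 0.0
        self._steps = 0

    def reset(self):
        # Start each episode at a random position in [-1, 1].
        self._state = self._rng.uniform(-1.0, 1.0)
        self._steps = 0
        return self._state

    def step(self, action):
        # Move by the action; reward is higher the closer the state is to zero.
        self._state += action
        self._steps += 1
        reward = -abs(self._state)
        done = self._steps >= self._max_steps
        return self._state, reward, done


class ToyBatchEnv:
    """Hypothetical wrapper that steps several environments as one batch."""

    def __init__(self, envs):
        self._envs = list(envs)

    def __len__(self):
        return len(self._envs)

    def reset(self):
        # One observation per environment, in a fixed order.
        return [env.reset() for env in self._envs]

    def step(self, actions):
        if len(actions) != len(self._envs):
            raise ValueError("need exactly one action per environment")
        results = [env.step(a) for env, a in zip(self._envs, actions)]
        # Transpose the list of (observ, reward, done) into three batches.
        observs, rewards, dones = (list(column) for column in zip(*results))
        return observs, rewards, dones


# Usage: four environments stepped together with one batched call.
batch = ToyBatchEnv(ToyEnv(seed) for seed in range(4))
start = batch.reset()
actions = [-0.5 * observ for observ in start]  # move halfway toward zero
observs, rewards, dones = batch.step(actions)
```

In the project itself, each environment would sit behind its own process and the batched step would be exposed as a TensorFlow operation; the sketch leaves both out to keep the batching logic visible.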

Quick Start & Requirements

  • Install: Clone the repository.
  • Run: python3 -m agents.scripts.train --logdir=/path/to/logdir --config=pendulum
  • Prerequisites: Python 2/3, TensorFlow 1.3+, Gym, ruamel.yaml.
  • Visualization: tensorboard --logdir=/path/to/logdir --port=2222
  • Rendering/Stats: python3 -m agents.scripts.visualize --logdir=/path/to/logdir/<time>-<config> --outdir=/path/to/outdir/
  • Docs: The TensorFlow Agents paper, which the README asks users to cite when they use the code.

Highlighted Details

  • Optimized infrastructure for batched reinforcement learning in TensorFlow.
  • Efficient parallel environment execution using external processes.
  • Integrated TensorFlow graph operations for environment stepping.
  • Single-operation fusion of environment steps and agent updates.

Maintenance & Community

  • For questions, open an issue on GitHub.

Licensing & Compatibility

  • License: Not explicitly stated in the README. Suitability for commercial use or closed-source linking is not specified there.

Limitations & Caveats

The project requires TensorFlow 1.3+, a TensorFlow 1.x release that predates TensorFlow 2, so the graph-based code may not run unmodified on current TensorFlow versions. The README lists Python 2/3 compatibility, but modern usage would likely focus on Python 3. The README does not mention GPU support or CUDA requirements, though TensorFlow workloads typically benefit from a GPU.

Health Check

  • Last commit: 6 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
