batch-ppo by google-research

TensorFlow infrastructure for batched reinforcement learning

created 8 years ago
968 stars

Top 38.9% on sourcepulse

Project Summary

This project provides an optimized infrastructure for reinforcement learning agents implemented in TensorFlow, specifically targeting efficient batched computation across multiple parallel environments. It offers an implementation of Proximal Policy Optimization (PPO) as a starting point for researchers and practitioners looking to build and experiment with RL algorithms.

How It Works

The core of the project is a batched environment interface that plugs into TensorFlow. agents.tools.wrappers.ExternalProcess runs each Gym environment in its own process, so environment steps are not serialized by Python's global interpreter lock (GIL) and can run in parallel. agents.tools.BatchEnv then aggregates these parallel environments: it accepts a batch of actions and returns batched observations, rewards, and done flags. agents.tools.InGraphBatchEnv brings the batch into the TensorFlow graph, exposing environment steps as operations. Finally, agents.tools.simulate() fuses environment stepping and the agent update into a single TensorFlow operation that the training loop calls repeatedly.
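The batching idea can be illustrated with a small, library-independent sketch. It is not the project's code: ToyEnv and ToyBatchEnv are hypothetical names invented here, the environments run in the same process rather than in external processes, and no TensorFlow graph is involved. The sketch only shows the shape of the interface, with one batched action in and batched observations, rewards, and done flags out.

```python
import random
from typing import List, Tuple


class ToyEnv:
    """Hypothetical Gym-style environment: a 1-D point rewarded for staying near 0."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self._pos = 0.0
        self._steps = 0

    def reset(self) -> float:
        self._pos = self._rng.uniform(-1.0, 1.0)
        self._steps = 0
        return self._pos

    def step(self, action: float) -> Tuple[float, float, bool]:
        self._pos += action
        self._steps += 1
        reward = -abs(self._pos)   # closer to zero is better
        done = self._steps >= 5    # fixed episode length of 5 steps
        return self._pos, reward, done


class ToyBatchEnv:
    """Hypothetical aggregator: steps several environments with one batched call."""

    def __init__(self, envs: List[ToyEnv]) -> None:
        self._envs = envs

    def __len__(self) -> int:
        return len(self._envs)

    def reset(self) -> List[float]:
        return [env.reset() for env in self._envs]

    def step(self, actions: List[float]) -> Tuple[List[float], List[float], List[bool]]:
        if len(actions) != len(self._envs):
            raise ValueError("need exactly one action per environment")
        results = [env.step(a) for env, a in zip(self._envs, actions)]
        # Transpose [(obs, reward, done), ...] into three parallel lists.
        observs, rewards, dones = (list(column) for column in zip(*results))
        return observs, rewards, dones


# Usage: four environments stepped together by a simple proportional policy.
batch = ToyBatchEnv([ToyEnv(seed=i) for i in range(4)])
observs = batch.reset()
actions = [-0.5 * o for o in observs]
observs, rewards, dones = batch.step(actions)
```

In the project itself, the per-environment step calls would be dispatched to separate processes and the batched step exposed as a TensorFlow operation; this sketch leaves both out to keep the interface visible.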

Quick Start & Requirements

  • Install: Clone the repository.
  • Run: python3 -m agents.scripts.train --logdir=/path/to/logdir --config=pendulum
  • Prerequisites: Python 2/3, TensorFlow 1.3+, Gym, ruamel.yaml.
  • Visualization: tensorboard --logdir=/path/to/logdir --port=2222
  • Rendering/Stats: python3 -m agents.scripts.visualize --logdir=/path/to/logdir/<time>-<config> --outdir=/path/to/outdir/
  • Docs: The TensorFlow Agents paper, which the README asks you to cite if you use the code.

Highlighted Details

  • Optimized infrastructure for batched reinforcement learning in TensorFlow.
  • Efficient parallel environment execution using external processes.
  • Integrated TensorFlow graph operations for environment stepping.
  • Single-operation fusion of environment steps and agent updates.

Maintenance & Community

  • For questions, open an issue on GitHub.

Licensing & Compatibility

  • License: Not explicitly stated in the README, so terms for commercial use or closed-source linking are not specified there. Check the repository's license file before relying on the code.

Limitations & Caveats

The project requires TensorFlow 1.3+, a release line that is now significantly outdated. The README lists both Python 2 and Python 3 as supported, but current use would realistically mean Python 3. The README does not mention GPU support or CUDA requirements.

Health Check

  • Last commit: 6 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 3 stars in the last 90 days
