TensorFlow infrastructure for batched reinforcement learning
This project provides an optimized infrastructure for reinforcement learning agents implemented in TensorFlow, specifically targeting efficient batched computation across multiple parallel environments. It offers an implementation of Proximal Policy Optimization (PPO) as a starting point for researchers and practitioners looking to build and experiment with RL algorithms.
How It Works
The core innovation lies in its batched environment interface, which integrates seamlessly with TensorFlow. It uses agents.tools.wrappers.ExternalProcess to run Gym environments in separate processes, bypassing Python's GIL for true parallelism. agents.tools.BatchEnv then aggregates these parallel environments, accepting batched actions and returning batched results. agents.tools.InGraphBatchEnv further integrates this into the TensorFlow graph, exposing environment steps as operations. Finally, agents.tools.simulate() fuses environment stepping and agent updates into a single TensorFlow operation for efficient training loops.
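The sketch below illustrates how these pieces might be wired together. It is not taken from the repository: the environment count, the blocking keyword argument, and the simulate() return values are assumptions based on the description above rather than documented API details.

import gym
from agents import tools

def make_env():
    # Any Gym environment constructor works here; Pendulum matches the bundled config.
    return gym.make('Pendulum-v0')

# Run each Gym environment in its own OS process to sidestep Python's GIL.
envs = [tools.wrappers.ExternalProcess(make_env) for _ in range(8)]

# Aggregate the processes behind a single interface that takes a batch of
# actions and returns a batch of results.
batch_env = tools.BatchEnv(envs, blocking=False)

# Expose the batched reset/step calls as TensorFlow operations.
in_graph_env = tools.InGraphBatchEnv(batch_env)

# Given an algorithm object (e.g. the bundled PPO implementation), a single
# in-graph operation would step the environments and update the agent:
# done, score, summary = tools.simulate(in_graph_env, algo, log=True, reset=False)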
Quick Start & Requirements
The project targets TensorFlow 1.3+ and OpenAI Gym (see Limitations & Caveats below).
# Train a PPO agent using the bundled pendulum configuration:
python3 -m agents.scripts.train --logdir=/path/to/logdir --config=pendulum
# Monitor training progress with TensorBoard:
tensorboard --logdir=/path/to/logdir --port=2222
# Visualize a trained agent; the <time>-<config> directory is created by the training run:
python3 -m agents.scripts.visualize --logdir=/path/to/logdir/<time>-<config> --outdir=/path/to/outdir/
Highlighted Details
Maintenance & Community
The repository is no longer actively maintained; its last activity was roughly six years ago.
Licensing & Compatibility
Limitations & Caveats
The project requires TensorFlow 1.3+, which is significantly outdated. The README mentions Python 2/3 compatibility, but modern usage would likely focus on Python 3. No explicit mention of GPU support or CUDA requirements is made, though TensorFlow typically benefits from them.