EasyReinforcementLearning by alibaba

Scalable reinforcement learning package

Created 6 years ago

254 stars

Top 99.0% on SourcePulse

Project Summary

EasyRL is an easy-to-use, comprehensive reinforcement learning (RL) framework that aims to reduce the complexity of implementing sophisticated RL algorithms for real-world applications. It targets practitioners who want to apply RL with minimal effort and offers a unified approach to developing both standalone and distributed RL algorithms, which is particularly useful in e-commerce and interactive scenarios.

How It Works

EasyRL is built entirely on TensorFlow and uses its computation graph for both processing and distributed communication. It employs a flexible actor-learner architecture that abstracts processes into four roles: Actor (data collection), Learner (model updates), Buffer (sample management), and Parameter Server (model storage). This design is intended to make the package easy to study, integrate, and migrate across platforms, and it is expressive enough to support both on-policy and off-policy distributed RL algorithms. The project reports performance comparable or superior to state-of-the-art packages.
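The four roles can be pictured with a minimal single-process sketch in plain Python. This is an illustration only, not EasyRL's API: the class names, method names (pull, push, put, sample, collect, update), and the toy regression task are all assumptions made for the example, and the real package runs these roles as distributed TensorFlow processes.

```python
import random
from collections import deque


class ParameterServer:
    """Model storage: holds the latest parameters (here, a single float)."""

    def __init__(self, weight=0.0):
        self.weight = weight
        self.version = 0

    def pull(self):
        return self.weight, self.version

    def push(self, weight):
        self.weight = weight
        self.version += 1


class Buffer:
    """Sample management: stores actor output until a learner reads it."""

    def __init__(self, capacity=1000):
        self.samples = deque(maxlen=capacity)

    def put(self, sample):
        self.samples.append(sample)

    def sample(self, batch_size, rng):
        pool = list(self.samples)
        return rng.sample(pool, min(batch_size, len(pool)))


class Actor:
    """Data collection: pulls parameters, then generates samples."""

    def __init__(self, server, buffer, rng):
        self.server, self.buffer, self.rng = server, buffer, rng

    def collect(self, steps):
        # A real actor would act with the pulled parameters; this toy
        # only tags each sample with the parameter version it saw.
        _, version = self.server.pull()
        for _ in range(steps):
            x = self.rng.uniform(-1.0, 1.0)
            target = 3.0 * x  # hypothetical toy "environment": true weight is 3
            self.buffer.put((x, target, version))


class Learner:
    """Model updates: reads a batch, takes a gradient step, pushes the result."""

    def __init__(self, server, buffer, rng, lr=0.1):
        self.server, self.buffer, self.rng, self.lr = server, buffer, rng, lr

    def update(self, batch_size=32):
        weight, _ = self.server.pull()
        batch = self.buffer.sample(batch_size, self.rng)
        if not batch:
            return
        # Gradient of the mean squared error of weight * x against target.
        grad = sum(2.0 * (weight * x - y) * x for x, y, _ in batch) / len(batch)
        self.server.push(weight - self.lr * grad)


def run(iterations=200, seed=0):
    rng = random.Random(seed)
    server, buffer = ParameterServer(), Buffer()
    actor = Actor(server, buffer, rng)
    learner = Learner(server, buffer, rng)
    for _ in range(iterations):
        actor.collect(steps=8)
        learner.update(batch_size=32)
    return server.pull()
```

Calling run() alternates collection and updates in one loop; in a distributed setting each role would instead run in its own process and communicate through the buffer and parameter server.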

Quick Start & Requirements

Installation: Clone the repository (git clone https://github.com/alibaba/EasyRL.git && cd EasyRL) and install using pip install -e . --verbose.
Prerequisites: TensorFlow is the core dependency. Specific hardware requirements (e.g., GPU, CUDA) are not explicitly stated for basic installation but are implied for achieving reported performance metrics.
Demo: An example run_dqn_on_pong.py is provided in the demo/ directory.
Documentation: The README links to more comprehensive documentation, but notes that the link currently has a theme-related issue.

Highlighted Details

Offers a comprehensive suite of popular RL algorithms, including DQN, PPO, ES, Rainbow, DDPG, ApeX, and IMPALA, and is credited with 8 supported functionalities in a comparison against other packages.
Reported empirical evaluations show competitive or superior performance, including fast convergence (for example, solving Pong within minutes), and the package outperforms some state-of-the-art RL packages in specific configurations.
The actor-learner architecture is expressive enough to handle both on-policy and off-policy algorithms without altering the fundamental responsibilities of each process role.
Includes RL-oriented summary functionalities that simplify tracking TensorFlow operations without manual hook management.
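One way to picture how the same roles can serve both on-policy and off-policy algorithms is to confine the difference to how the buffer hands out samples. The sketch below assumes exactly that. The class names OnPolicyBuffer and ReplayBuffer and the methods put and get_batch are hypothetical, chosen for illustration, and do not describe EasyRL's actual code.

```python
import random
from collections import deque


class OnPolicyBuffer:
    """Hands each sample to the learner exactly once, in arrival order."""

    def __init__(self):
        self._samples = []

    def put(self, sample):
        self._samples.append(sample)

    def get_batch(self, batch_size):
        # Consumed samples are dropped so stale data is never reused.
        batch = self._samples[:batch_size]
        self._samples = self._samples[batch_size:]
        return batch


class ReplayBuffer:
    """Keeps the most recent samples and lets the learner reuse them."""

    def __init__(self, capacity, seed=0):
        self._samples = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def put(self, sample):
        self._samples.append(sample)

    def get_batch(self, batch_size):
        # Uniform sampling without removal: old data stays available.
        pool = list(self._samples)
        return self._rng.sample(pool, min(batch_size, len(pool)))
```

Because both classes expose the same put and get_batch interface, an actor or learner written against that interface would not need to change when the algorithm family does.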

Maintenance & Community

The provided README does not contain specific details regarding notable contributors, sponsorships, community channels (e.g., Discord, Slack), or a public roadmap.

Licensing & Compatibility

The license type is not explicitly stated in the README.

Limitations & Caveats

The README notes a "theme-related issue" affecting the link to the comprehensive documentation, which is slated for a fix. Performance claims depend on specific experimental setups and hardware configurations.

Health Check

Last Commit

4 years ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 star in the last 30 days