bsuite  by google-deepmind

RL benchmark suite for core capability investigation

created 6 years ago
1,524 stars

Top 27.7% on sourcepulse

Project Summary

bsuite is a benchmark suite for evaluating core reinforcement learning (RL) agent capabilities. It provides a standardized set of carefully designed experiments to facilitate reproducible research and the development of more general and efficient RL algorithms. The target audience includes RL researchers and practitioners seeking to rigorously assess and compare agent performance across fundamental challenges.

How It Works

bsuite is a collection of RL environments, each shipped with a fixed set of configurations and matching analysis plots. Experiments are instrumented so that results are logged automatically when an environment is loaded through one of the load_and_record* functions, and the output format is the one expected by the provided Jupyter analysis notebook. Because logging happens in the environment rather than in the agent, any agent can be evaluated without conforming to a particular structure, which simplifies benchmarking and comparison.
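
The idea of an instrumented environment can be illustrated with a minimal, standard-library-only sketch. This is not bsuite's implementation: CoinFlipEnv, RecordingEnv, run_random_agent, and the CSV column names are all hypothetical and exist only to show how a wrapper can record per-episode results while the agent stays unaware of logging.

```python
import csv
import io
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess a coin flip for a fixed number of steps."""

    def __init__(self, episode_length=5, seed=0):
        self._episode_length = episode_length
        self._rng = random.Random(seed)
        self._t = 0

    def reset(self):
        self._t = 0
        return 0  # trivial constant observation

    def step(self, action):
        self._t += 1
        reward = 1.0 if action == self._rng.randint(0, 1) else 0.0
        done = self._t >= self._episode_length
        return 0, reward, done


class RecordingEnv:
    """Hypothetical wrapper that writes one CSV row per finished episode."""

    def __init__(self, env, sink):
        self._env = env
        self._writer = csv.writer(sink)
        self._writer.writerow(["episode", "steps", "total_return"])
        self._episode = 0
        self._steps = 0
        self._return = 0.0

    def reset(self):
        self._steps = 0
        self._return = 0.0
        return self._env.reset()

    def step(self, action):
        obs, reward, done = self._env.step(action)
        self._steps += 1
        self._return += reward
        if done:
            # Logging happens here, inside the environment wrapper.
            self._episode += 1
            self._writer.writerow([self._episode, self._steps, self._return])
        return obs, reward, done


def run_random_agent(env, num_episodes, seed=1):
    """The agent only calls reset() and step(); it contains no logging code."""
    rng = random.Random(seed)
    for _ in range(num_episodes):
        env.reset()
        done = False
        while not done:
            _, _, done = env.step(rng.randint(0, 1))


sink = io.StringIO()  # stands in for a results file on disk
run_random_agent(RecordingEnv(CoinFlipEnv(), sink), num_episodes=3)
log_rows = list(csv.reader(sink.getvalue().splitlines()))
print(log_rows)
```

Swapping the random agent for any other agent leaves the logged format unchanged, which is the property that lets a single analysis notebook compare arbitrary agents.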

Quick Start & Requirements

  • Install via pip: pip install bsuite
  • For baseline examples: pip install bsuite[baselines]
  • Tested on Python 3.6 & 3.7.
  • Colab tutorial available: bit.ly/bsuite-colab

Highlighted Details

  • Environments feature small observation sizes, enabling reasonable performance on CPUs with small networks.
  • Supports logging to CSV, SQLite, or terminal, with custom logging easily implementable.
  • Includes a utility to wrap environments for OpenAI Gym compatibility.
  • Provides baseline agent implementations and scripts for running experiments, including parallel execution and Google Cloud Platform integration.
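
To make the Gym-compatibility point concrete, here is a minimal sketch of the adapter pattern such a utility typically follows. It does not reproduce bsuite's own wrapper: TimeStep, CountdownEnv, and GymStyleAdapter are hypothetical stand-ins, and the four-value step() return assumes the classic Gym convention (newer Gymnasium releases return five values).

```python
from typing import Any, NamedTuple, Optional


class TimeStep(NamedTuple):
    """Hypothetical, simplified stand-in for a timestep-style return value."""
    observation: Any
    reward: Optional[float]
    last: bool


class CountdownEnv:
    """Hypothetical timestep-style environment that ends after n steps."""

    def __init__(self, n=3):
        self._n = n
        self._remaining = n

    def reset(self):
        self._remaining = self._n
        # The first timestep of an episode carries no reward.
        return TimeStep(observation=self._remaining, reward=None, last=False)

    def step(self, action):
        self._remaining -= 1
        return TimeStep(self._remaining, 1.0, self._remaining == 0)


class GymStyleAdapter:
    """Exposes reset() -> obs and step() -> (obs, reward, done, info)."""

    def __init__(self, env):
        self._env = env

    def reset(self):
        return self._env.reset().observation

    def step(self, action):
        ts = self._env.step(action)
        # Map a missing reward to 0.0 so callers always receive a float.
        reward = ts.reward if ts.reward is not None else 0.0
        return ts.observation, reward, ts.last, {}


env = GymStyleAdapter(CountdownEnv(n=3))
first_obs = env.reset()
transitions = []
done = False
while not done:
    obs, reward, done, info = env.step(0)
    transitions.append((obs, reward, done))
print(first_obs, transitions)
```

The adapter only reshapes return values, so agent code written against the Gym-style interface can run on the wrapped environment without changes.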

Licensing & Compatibility

  • Apache 2.0 License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is documented as tested only on Python 3.6 and 3.7, so it may not work unmodified on newer Python versions. Baseline dependencies are not installed by default; users who want the baseline agents must install the bsuite[baselines] extra explicitly.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
