bsuite by google-deepmind

RL benchmark suite for core capability investigation

Created 6 years ago
1,526 stars

Top 27.2% on SourcePulse

View on GitHub
Project Summary

bsuite is a benchmark suite for evaluating core reinforcement learning (RL) agent capabilities. It provides a standardized set of carefully designed experiments to facilitate reproducible research and the development of more general and efficient RL algorithms. The target audience includes RL researchers and practitioners seeking to rigorously assess and compare agent performance across fundamental challenges.

How It Works

bsuite comprises a collection of RL environments, each defined with a specific configuration and accompanying analysis plots. Experiments are instrumented to log results automatically via the load_and_record* functions, writing data in a format the provided Jupyter notebook can load for analysis. Because logging happens inside the environment, any agent that can interact with it can be evaluated, with no constraints on the agent's structure; this simplifies benchmarking and comparison.
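As a minimal sketch of this workflow, assuming the load_and_record_to_csv entry point and the dm_env-style interface that bsuite environments follow (the random policy here is a placeholder, not part of the library):

    import numpy as np
    import bsuite

    # Loading through load_and_record_to_csv instruments the environment
    # so results are written to CSV automatically as the agent interacts.
    env = bsuite.load_and_record_to_csv(
        bsuite_id='catch/0',
        results_dir='/tmp/bsuite',
    )

    # bsuite environments follow the dm_env interface: reset() starts an
    # episode, step() advances it, and timestep.last() marks episode end.
    num_actions = env.action_spec().num_values
    for _ in range(env.bsuite_num_episodes):
        timestep = env.reset()
        while not timestep.last():
            action = np.random.randint(num_actions)  # placeholder policy
            timestep = env.step(action)

The CSV files produced this way are what the provided analysis notebook consumes.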

Quick Start & Requirements

  • Install via pip: pip install bsuite
  • For baseline examples: pip install bsuite[baselines]
  • Tested on Python 3.6 & 3.7.
  • Colab tutorial available: bit.ly/bsuite-colab

Highlighted Details

  • Environments use small observation sizes, so agents with small networks run at reasonable speed on CPU.
  • Supports logging to CSV, SQLite, or terminal, with custom logging easily implementable.
  • Includes a utility to wrap environments for OpenAI Gym compatibility (see the sketch after this list).
  • Provides baseline agent implementations and scripts for running experiments, including parallel execution and Google Cloud Platform integration.
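A rough sketch of the Gym wrapper mentioned above, assuming the GymFromDMEnv class in bsuite.utils.gym_wrapper and the classic (pre-0.26) Gym step API:

    import bsuite
    from bsuite.utils import gym_wrapper

    # Wrap a bsuite (dm_env) environment in the OpenAI Gym interface.
    env = bsuite.load_from_id('catch/0')
    gym_env = gym_wrapper.GymFromDMEnv(env)

    obs = gym_env.reset()
    obs, reward, done, info = gym_env.step(gym_env.action_space.sample())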

Maintenance & Community

Licensing & Compatibility

  • Apache 2.0 License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is tested only on Python 3.6 and 3.7, so newer Python versions may not work out of the box. Baseline dependencies are not installed by default; users who want the reference agents must install them explicitly with pip install bsuite[baselines].

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Explore Similar Projects

Starred by Morgan Funtowicz (Head of ML Optimizations at Hugging Face), Luis Capelo (Cofounder of Lightning AI), and 7 more.

lighteval by huggingface

  • Top 2.6% · 2k stars
  • LLM evaluation toolkit for multiple backends
  • Created 1 year ago · Updated 1 day ago
Starred by Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI), John Yang (Coauthor of SWE-bench, SWE-agent), and 6 more.

cleanrl by vwxyzjn

  • Top 0.5% · 8k stars
  • RL algorithms implementation with research-friendly features
  • Created 6 years ago · Updated 2 months ago