tianshou by thu-ml

PyTorch RL library for algorithm development and application

Created 7 years ago

9,039 stars

Top 5.7% on SourcePulse

8 Experts Love This Project

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

Pawel Garbacki

Cofounder of Fireworks AI

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

Kaichao You

Core Maintainer of vLLM

and 4 more!

Project Summary

Tianshou is a comprehensive, modular, and high-performance deep reinforcement learning library built on PyTorch and Gymnasium. It targets both RL researchers seeking flexible, hackable interfaces for algorithm development and practitioners needing user-friendly tools for applying RL to custom environments. Tianshou offers a wide range of supported algorithms, including online, offline, and experimental multi-agent and model-based RL, aiming to enable concise and efficient implementations.

How It Works

Tianshou has a dual API design: a high-level, declarative API for ease of use in applications, and a low-level, procedural API for maximum flexibility in algorithm development. This modularity makes it straightforward to integrate new algorithms and customize training. It supports vectorized environments, recurrent state representations, and various state and action types, and it emphasizes performance through optimized components such as Numba JIT-compiled operations for n-step returns and prioritized experience replay.
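To make the n-step return mentioned above concrete, here is a minimal pure-Python sketch. It is not Tianshou's implementation (which the summary says is Numba-compiled); the function name n_step_returns, its parameters, and the convention that bootstrap_values[t] estimates the value of the state reached after step t are all assumptions made for illustration.

```python
def n_step_returns(rewards, dones, bootstrap_values, gamma=0.99, n=3):
    """Compute n-step returns for every step of a trajectory.

    Hypothetical helper for illustration only, not part of Tianshou.
    rewards[t]          -- reward received at step t
    dones[t]            -- True if the episode ended at step t
    bootstrap_values[t] -- assumed value estimate of the state reached after step t
    """
    horizon = len(rewards)
    returns = []
    for t in range(horizon):
        g = 0.0
        discount = 1.0
        last = t
        terminated = False
        # Accumulate up to n discounted rewards, stopping at an episode end.
        for k in range(t, min(t + n, horizon)):
            g += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                terminated = True
                break
        # Bootstrap from the value estimate unless the episode terminated.
        if not terminated:
            g += discount * bootstrap_values[last]
        returns.append(g)
    return returns


rewards = [1.0, 1.0, 1.0, 1.0]
dones = [False, False, False, True]
values = [10.0, 10.0, 10.0, 10.0]
print(n_step_returns(rewards, dones, values, gamma=0.5, n=2))
# prints [4.0, 4.0, 1.5, 1.0]
```

The nested loop keeps the logic easy to follow; a production version would typically vectorize or JIT-compile this computation, which is presumably why the library optimizes it.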

Quick Start & Requirements

Installation: clone the repository and run poetry install (recommended for the latest version), or run pip install tianshou (PyPI release, potentially outdated). Add extras as needed, e.g. poetry install --extras "mujoco envpool".
Prerequisites: Python >= 3.11. Optional extras: atari, box2d, classic_control, mujoco, pybullet, robotics, vizdoom, envpool, argparse.
Documentation: tianshou.readthedocs.io
Examples: examples/ folder

Highlighted Details

Achieves state-of-the-art results in MuJoCo benchmarks for multiple algorithms.
Supports highly optimized vectorized environments via EnvPool.
Provides a general policy interface (__init__, forward, process_buffer, process_fn, learn, post_process_fn, update) for straightforward algorithm experimentation.
Offers comprehensive testing, including full agent training procedures, to ensure reproducibility.
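The policy interface listed above can be pictured with a toy, library-free sketch. ToyPolicy below is a hypothetical stand-in, not a Tianshou class: it borrows only the method names from the list, and the assumption that update samples a batch and then chains process_fn, learn, and post_process_fn should be checked against the official documentation. process_buffer is left out because this summary does not describe its role.

```python
import random


class ToyPolicy:
    """Hypothetical bandit-style policy mirroring the listed method names.

    Not Tianshou's API: signatures and data layout are assumptions.
    """

    def __init__(self, n_actions, lr=0.1):
        self.n_actions = n_actions
        self.lr = lr
        self.q = [0.0] * n_actions  # one value estimate per action
        self.calls = []  # records hook order for inspection

    def forward(self, obs):
        # Pick the action with the highest value estimate (obs is ignored here).
        return max(range(self.n_actions), key=lambda a: self.q[a])

    def process_fn(self, batch):
        # Pre-processing hook: attach a learning target to each sample.
        self.calls.append("process_fn")
        return [(action, reward, reward) for action, reward in batch]

    def learn(self, batch):
        # Move each value estimate toward its target; return mean squared error.
        self.calls.append("learn")
        loss = 0.0
        for action, _, target in batch:
            error = target - self.q[action]
            self.q[action] += self.lr * error
            loss += error * error
        return loss / len(batch)

    def post_process_fn(self, batch):
        # Post-processing hook, e.g. where replay priorities could be refreshed.
        self.calls.append("post_process_fn")

    def update(self, buffer, sample_size):
        # Assumed ordering: sample, pre-process, learn, post-process.
        batch = random.sample(buffer, sample_size)
        batch = self.process_fn(batch)
        loss = self.learn(batch)
        self.post_process_fn(batch)
        return loss


random.seed(0)
policy = ToyPolicy(n_actions=2)
buffer = [(0, 0.0), (1, 1.0)] * 10  # (action, reward) pairs
for _ in range(50):
    policy.update(buffer, sample_size=4)
print(policy.forward(obs=None))
```

Splitting update into small hooks means a new algorithm only has to override the step that differs, which is the kind of experimentation the interface is described as supporting.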

Maintenance & Community

Supported by appliedAI Institute for Europe.
Actively developed with continuous addition of algorithms and features. Contribution guidelines are available.
Users are encouraged to cite the project's JMLR publication.

Licensing & Compatibility

MIT License.
Compatible with commercial and closed-source applications.

Limitations & Caveats

The mujoco-py extra is for legacy compatibility and may have issues with newer macOS versions.
The PyPI release is described as "far behind the master" branch, so install from source to get the latest version.

Health Check

Last Commit

1 month ago

Responsiveness

1 day

Star History

57 stars in the last 30 days