LightZero by opendilab

MCTS/RL toolkit for decision-making problems

Created 3 years ago

1,502 stars

Top 27.3% on SourcePulse

Project Summary

LightZero is a PyTorch-based toolkit for Monte Carlo Tree Search (MCTS) combined with Deep Reinforcement Learning (RL), designed to standardize and accelerate research in general sequential decision-making scenarios. It offers a lightweight, efficient, and easy-to-understand framework for implementing and benchmarking MCTS+RL algorithms like AlphaZero and MuZero.

How It Works

LightZero's framework comprises three core modules: Model (network architecture), Policy (learning, collecting, and evaluation processes), and MCTS (tree structure and interaction with Policy). MCTS is implemented in both Python and C++, with the C++ version aimed at performance. This modular design makes the various MCTS algorithms easier to understand, compare, and customize.
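To make the three-module split concrete, here is a minimal, standard-library-only sketch of how a Model, an MCTS, and a Policy could be separated. It is not LightZero's actual API: the names UniformModel, Node, MCTS, Policy, and step, the toy "land exactly on 3" environment, and the c_puct and num_simulations defaults are all assumptions made for illustration. The search uses an AlphaZero-style PUCT selection rule with the true environment as the simulator.

```python
import math
from dataclasses import dataclass, field

TARGET = 3          # toy goal (assumption): land exactly on this number
ACTIONS = (1, 2)    # action index 0 adds 1, action index 1 adds 2


def step(state, action):
    """Toy deterministic environment: returns (next_state, reward, done)."""
    nxt = state + ACTIONS[action]
    return nxt, (1.0 if nxt == TARGET else 0.0), nxt >= TARGET


class UniformModel:
    """Stand-in for the Model module: maps a state to (prior, value)."""

    def predict(self, state):
        # An untrained network: uniform prior over actions, zero value.
        return [1.0 / len(ACTIONS)] * len(ACTIONS), 0.0


@dataclass
class Node:
    prior: float
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


class MCTS:
    """Stand-in for the MCTS module: owns the tree and the search loop."""

    def __init__(self, model, num_simulations=50, c_puct=1.25):
        self.model = model
        self.num_simulations = num_simulations
        self.c_puct = c_puct

    def _expand(self, node, state):
        prior, value = self.model.predict(state)
        for a, p in enumerate(prior):
            node.children[a] = Node(prior=p)
        return value

    def _select(self, node):
        # PUCT: exploit the mean value Q, explore in proportion to the prior.
        total = sum(c.visits for c in node.children.values())

        def score(item):
            child = item[1]
            bonus = self.c_puct * child.prior * math.sqrt(total + 1)
            return child.q() + bonus / (1 + child.visits)

        return max(node.children.items(), key=score)

    def search(self, root_state):
        root = Node(prior=1.0)
        self._expand(root, root_state)
        for _ in range(self.num_simulations):
            node, state, path, ret, done = root, root_state, [], 0.0, False
            while node.children and not done:
                action, node = self._select(node)
                state, reward, done = step(state, action)
                ret += reward
                path.append(node)
            if not done:
                ret += self._expand(node, state)  # bootstrap from the model
            # Reward only arrives at the final step here, so the same
            # undiscounted return applies to every node on the path.
            for visited in path:
                visited.visits += 1
                visited.value_sum += ret
        return {a: c.visits for a, c in root.children.items()}


class Policy:
    """Stand-in for the Policy module: turns search results into actions."""

    def __init__(self, model, num_simulations=50):
        self.mcts = MCTS(model, num_simulations)

    def act(self, state):
        visits = self.mcts.search(state)
        return max(visits, key=visits.get), visits


policy = Policy(UniformModel())
action, visits = policy.act(2)  # from state 2, adding 1 hits TARGET
print(action, visits)
```

In this sketch the Policy never touches tree internals and the MCTS never knows how the prior was produced, so either side could be swapped (for example, a trained network in place of UniformModel) without changing the other. That is the kind of separation the module description above suggests, though the real interfaces may differ.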

Quick Start & Requirements

Installation: Clone the repository, then run pip3 install -e . from its root directory.
Prerequisites: Linux or macOS for compilation. Docker support is available.
Quick Start Examples: Examples are provided for training MuZero on CartPole, Pong, and TicTacToe.
Documentation: Guides are available for customizing environments and algorithms.

Highlighted Details

Implements key MCTS+RL algorithms: AlphaZero, MuZero, EfficientZero, Sampled MuZero, Stochastic MuZero, Gumbel MuZero.
Supports a wide range of environments including classic control, Atari, MuJoCo, and board games.
Uses mixed heterogeneous computing to make MCTS more efficient.
Provides detailed documentation, framework diagrams, and function call graphs.

Maintenance & Community

The project is actively maintained, with recent updates and a clear versioning scheme. Community interaction is encouraged via GitHub issues, a discussion forum, and a Discord server.

Licensing & Compatibility

All code is released under the Apache License 2.0, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Compilation is currently limited to Linux and macOS; Windows support is in progress. Some algorithm/environment combinations are marked as "Work In Progress" (🔒).
