decision-transformer by kzl

Codebase for the Decision Transformer research paper (RL via sequence modeling)

created 4 years ago
2,618 stars

Top 18.3% on sourcepulse

Project Summary

This repository provides the official codebase for Decision Transformer, a method that frames Reinforcement Learning (RL) as a sequence modeling problem. It is intended for researchers and practitioners in RL and deep learning who are interested in applying transformer architectures to sequential decision-making tasks. The primary benefit is enabling RL agents to learn from historical trajectories using standard sequence modeling techniques.

How It Works

Decision Transformer leverages the transformer architecture, widely used in natural language processing, to model sequences of states, actions, and returns. Rather than relying on value functions or policy gradients as traditional RL algorithms do, it treats RL as a conditional sequence generation problem: the model predicts the next action from a sequence of past states, actions, and a target return (return-to-go), effectively learning a policy that aims to achieve a specified level of performance.
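
To make the formulation concrete, here is a minimal, self-contained sketch of return-conditioned sequence modeling in PyTorch. It is illustrative only and not the repository's actual model or API; the dimensions, layer counts, and the use of nn.TransformerEncoder as the backbone are assumptions.

```python
# Minimal sketch of return-conditioned sequence modeling (NOT the repo's code).
# Each timestep contributes three tokens -- return-to-go, state, action -- and
# a causally masked transformer predicts the action from the tokens before it.
import torch
import torch.nn as nn


class MiniDecisionTransformer(nn.Module):
    def __init__(self, state_dim, act_dim, hidden_dim=128, n_layers=3,
                 n_heads=4, max_len=100):
        super().__init__()
        self.embed_rtg = nn.Linear(1, hidden_dim)        # return-to-go token
        self.embed_state = nn.Linear(state_dim, hidden_dim)
        self.embed_action = nn.Linear(act_dim, hidden_dim)
        self.embed_timestep = nn.Embedding(max_len, hidden_dim)
        layer = nn.TransformerEncoderLayer(hidden_dim, n_heads, 4 * hidden_dim,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.predict_action = nn.Linear(hidden_dim, act_dim)

    def forward(self, states, actions, returns_to_go, timesteps):
        # states: (B, T, state_dim), actions: (B, T, act_dim),
        # returns_to_go: (B, T, 1), timesteps: (B, T) long
        B, T, _ = states.shape
        time_emb = self.embed_timestep(timesteps)
        r = self.embed_rtg(returns_to_go) + time_emb
        s = self.embed_state(states) + time_emb
        a = self.embed_action(actions) + time_emb
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...).
        tokens = torch.stack([r, s, a], dim=2).reshape(B, 3 * T, -1)
        # Causal mask: each token attends only to itself and earlier tokens.
        mask = torch.triu(torch.full((3 * T, 3 * T), float("-inf")), diagonal=1)
        h = self.transformer(tokens, mask=mask)
        # Read out the action prediction from each step's state token.
        h = h.reshape(B, T, 3, -1)
        return self.predict_action(h[:, :, 1])


# Example: predict actions for a batch of 4 trajectories of length 10.
model = MiniDecisionTransformer(state_dim=17, act_dim=6)
states = torch.randn(4, 10, 17)
actions = torch.randn(4, 10, 6)
rtg = torch.randn(4, 10, 1)
timesteps = torch.arange(10).repeat(4, 1)
pred_actions = model(states, actions, rtg, timesteps)  # shape (4, 10, 6)
```

At evaluation time, the return-to-go channel is seeded with the desired target return and reduced by the reward actually received after each step, which is how the model can be prompted for a specified level of performance.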

Quick Start & Requirements

  • Install: Clone the repository and follow the instructions in the atari or gym subdirectories, adding the respective directory to your PYTHONPATH (see the sketch after this list).
  • Prerequisites: Python, PyTorch, OpenAI Gym, Atari environments. Specific dependencies are detailed in the sub-directory READMEs.
  • Resources: Requires significant computational resources for training, especially for Atari experiments.
  • Links: arXiv paper, "Decision Transformer: Reinforcement Learning via Sequence Modeling" (arXiv:2106.01345)
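
As a minimal sketch of the PYTHONPATH step (the clone location below is an assumption, not from the README), the sub-codebase can also be made importable at the top of a script:

```python
# Hypothetical setup: make one sub-codebase importable before running its scripts.
# The clone location is an assumed example, not prescribed by the repository.
import os
import sys

repo_root = os.path.expanduser("~/decision-transformer")  # assumed clone path
sys.path.insert(0, os.path.join(repo_root, "gym"))        # use "atari" for Atari experiments
```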

Highlighted Details

  • Reproduces experiments from the Decision Transformer paper.
  • Codebase split into atari and gym for distinct experiment types.
  • Applies transformer architecture to RL problems.

Maintenance & Community

The project is associated with authors from leading research institutions. No specific community channels or active maintenance signals are provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The README states that this is not an official Google or Facebook product. The codebase is aimed primarily at reproducing the paper's experiments and may require significant effort to adapt to novel applications or other RL environments.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

71 stars in the last 90 days

Explore Similar Projects

Starred by Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), Tom Brown (Cofounder of Anthropic), and 1 more.

spinningup by openai

Top 0.2% on sourcepulse
11k stars
Educational resource for learning deep reinforcement learning
created 6 years ago
updated 1 year ago