rlkit by rail-berkeley

RL algorithm collection implemented in PyTorch

Created 7 years ago · 2,766 stars · Top 17.2% on SourcePulse

Project Summary

RLkit is a PyTorch-based reinforcement learning framework offering a collection of state-of-the-art algorithms for researchers and practitioners. It aims to provide a modular and readable codebase for implementing and experimenting with advanced RL techniques, including goal-conditioned learning and meta-learning.

How It Works

RLkit is built on PyTorch, emphasizing modularity and code readability. Key architectural changes in version 0.2 include switching to native torch.nn.Module, replacing custom serialization classes with standard pickle, and refactoring training and sampling logic into separate objects. This batch-style training approach improves parallelization: data collection and gradient updates run as distinct phases rather than being interleaved step by step. The framework supports algorithms such as Soft Actor-Critic (SAC), Twin Delayed Deep Deterministic Policy Gradient (TD3), Hindsight Experience Replay (HER), and Skew-Fit.
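
To make the batch-style separation concrete, below is a minimal, generic sketch of one training epoch. The classes are illustrative stand-ins, not rlkit's actual API; they only show how sampling and training decompose into separate objects:

    import random

    class ReplayBuffer:
        """Flat list of transitions with uniform random sampling."""
        def __init__(self):
            self.data = []

        def extend(self, transitions):
            self.data.extend(transitions)

        def sample(self, batch_size):
            return random.sample(self.data, min(batch_size, len(self.data)))

    class PathCollector:
        """Rolls out the current policy; knows nothing about training."""
        def __init__(self, env, policy):
            self.env, self.policy = env, policy

        def collect(self, num_steps):
            transitions, obs = [], self.env.reset()
            for _ in range(num_steps):
                action = self.policy(obs)
                next_obs, reward, done, info = self.env.step(action)
                transitions.append((obs, action, reward, next_obs, done))
                obs = self.env.reset() if done else next_obs
            return transitions

    class Trainer:
        """Consumes minibatches; knows nothing about environments."""
        def train_on_batch(self, batch):
            pass  # compute losses and step the optimizers here

    def run_epoch(collector, buffer, trainer,
                  expl_steps=1000, train_steps=1000, batch_size=256):
        # Phase 1: gather a batch of fresh experience with the current policy.
        buffer.extend(collector.collect(expl_steps))
        # Phase 2: many gradient steps on replayed data, fully decoupled
        # from sampling, which is what makes the phases easy to parallelize.
        for _ in range(train_steps):
            trainer.train_on_batch(buffer.sample(batch_size))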

Quick Start & Requirements

  • Install using Anaconda: conda env create -f environment/[linux-cpu|linux-gpu|mac]-env.yml followed by source activate rlkit.
  • Run examples: python examples/ddpg.py.
  • Prerequisites: MuJoCo 1.5, gym 0.10.5. GPU support requires CUDA.
  • Optional: Install multiworld for specific Sawyer environment experiments.
  • A Docker image is available for portability and to work around rendering issues.
  • GPU usage: ptu.set_gpu_mode(True), or use_gpu=True when launching with doodad (see the snippet after this list).
  • Documentation consists of the example scripts plus legacy docs for v0.1.2.
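
For example, a training script enables GPU mode before building any networks. The ptu.set_gpu_mode(True) call is quoted from the project; the import path below follows rlkit's example scripts but should be verified against your checkout:

    # Assumed import path for the "ptu" utility module mentioned above.
    import rlkit.torch.pytorch_util as ptu

    # Place the tensors rlkit creates on the GPU (requires CUDA).
    ptu.set_gpu_mode(True)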

Highlighted Details

  • Implements algorithms such as SAC, TD3, HER, Skew-Fit, RIG, and AWAC.
  • Features batch-style training for improved parallelization.
  • Includes doodad integration for launching experiments on AWS/GCP (see the sketch after this list).
  • Offers visualization tools via viskit for policy evaluation.
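
A hedged sketch of launching an experiment through doodad: the run_experiment helper and its keyword names follow rlkit's example scripts, but treat them as assumptions and verify them against your checkout. The "ec2" mode additionally requires doodad and AWS credentials to be set up first:

    # Assumed import path; present in rlkit's examples, but verify locally.
    from rlkit.launchers.launcher_util import run_experiment

    def experiment(variant):
        # Build envs, networks, and the algorithm here, then train.
        pass

    run_experiment(
        experiment,
        exp_prefix="sac-example",  # hypothetical experiment name
        mode="ec2",                # "local" runs on this machine instead
        variant=dict(seed=0),
        use_gpu=True,              # matches use_gpu=True noted in Quick Start
    )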

Maintenance & Community

  • Initially developed by Vitchyr Pong, now maintained by RAIL Berkeley, primarily by Ashvin Nair.
  • Significant contributions from Murtaza Dalal and Steven Lin.
  • The infrastructure is based on rllab, and the Dockerfile is based on OpenAI's mujoco-py.

Licensing & Compatibility

  • The README does not explicitly state a license. Its rllab lineage and dependencies suggest a permissive license, but check the repository's LICENSE file before assuming compatibility with commercial use.

Limitations & Caveats

  • Some algorithms, such as Temporal Difference Models (TDMs) and the original RIG, are only available in the older v0.1.2 release.
  • Mac environment testing was done without a GPU.
  • AWS/GCP integration via doodad requires external setup knowledge.
Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 26 stars in the last 30 days
