rlkit by rail-berkeley

RL algorithm collection implemented in PyTorch

Created 7 years ago · 2,766 stars · Top 17.2% on SourcePulse

Project Summary

RLkit is a PyTorch-based reinforcement learning framework offering a collection of state-of-the-art algorithms for researchers and practitioners. It aims to provide a modular and readable codebase for implementing and experimenting with advanced RL techniques, including goal-conditioned learning and meta-learning.

How It Works

RLkit is built on PyTorch, emphasizing modularity and code readability. Key architectural changes in version 0.2 include switching to native torch.nn.Module, replacing custom serialization classes with standard pickle, and refactoring training and sampling logic into separate objects. This batch-style training approach improves parallelization: data collection and gradient updates run as distinct phases rather than being interleaved step by step. The framework supports algorithms such as Soft Actor-Critic (SAC), Twin Delayed Deep Deterministic Policy Gradient (TD3), Hindsight Experience Replay (HER), and Skew-Fit.
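
To make the batch-style separation concrete, below is a minimal, generic sketch of one training epoch. The classes are illustrative stand-ins, not rlkit's actual API; they only show how sampling and training decompose into separate objects:

    import random

    class ReplayBuffer:
        """Flat list of transitions with uniform random sampling."""
        def __init__(self):
            self.data = []

        def extend(self, transitions):
            self.data.extend(transitions)

        def sample(self, batch_size):
            return random.sample(self.data, min(batch_size, len(self.data)))

    class PathCollector:
        """Rolls out the current policy; knows nothing about training."""
        def __init__(self, env, policy):
            self.env, self.policy = env, policy

        def collect(self, num_steps):
            transitions, obs = [], self.env.reset()
            for _ in range(num_steps):
                action = self.policy(obs)
                next_obs, reward, done, info = self.env.step(action)
                transitions.append((obs, action, reward, next_obs, done))
                obs = self.env.reset() if done else next_obs
            return transitions

    class Trainer:
        """Consumes minibatches; knows nothing about environments."""
        def train_on_batch(self, batch):
            pass  # compute losses and step the optimizers here

    def run_epoch(collector, buffer, trainer,
                  expl_steps=1000, train_steps=1000, batch_size=256):
        # Phase 1: gather a batch of fresh experience with the current policy.
        buffer.extend(collector.collect(expl_steps))
        # Phase 2: many gradient steps on replayed data, fully decoupled
        # from sampling, which is what makes the phases easy to parallelize.
        for _ in range(train_steps):
            trainer.train_on_batch(buffer.sample(batch_size))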

Quick Start & Requirements

  • Install using Anaconda: conda env create -f environment/[linux-cpu|linux-gpu|mac]-env.yml followed by source activate rlkit.
  • Run examples: python examples/ddpg.py.
  • Prerequisites: MuJoCo 1.5, gym 0.10.5. GPU support requires CUDA.
  • Optional: Install multiworld for specific Sawyer environment experiments.
  • A Docker image is available for portability and to work around rendering issues.
  • GPU usage: ptu.set_gpu_mode(True), or use_gpu=True when launching with doodad (see the snippet after this list).
  • Documentation consists of the example scripts plus legacy docs for v0.1.2.
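
For example, a training script enables GPU mode before building any networks. The ptu.set_gpu_mode(True) call is quoted from the project; the import path below follows rlkit's example scripts but should be verified against your checkout:

    # Assumed import path for the "ptu" utility module mentioned above.
    import rlkit.torch.pytorch_util as ptu

    # Place the tensors rlkit creates on the GPU (requires CUDA).
    ptu.set_gpu_mode(True)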

Highlighted Details

  • Implements algorithms such as SAC, TD3, HER, Skew-Fit, RIG, and AWAC.
  • Features batch-style training for improved parallelization.
  • Includes doodad integration for launching experiments on AWS/GCP (see the sketch after this list).
  • Offers visualization tools via viskit for policy evaluation.
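
A hedged sketch of launching an experiment through doodad: the run_experiment helper and its keyword names follow rlkit's example scripts, but treat them as assumptions and verify them against your checkout. The "ec2" mode additionally requires doodad and AWS credentials to be set up first:

    # Assumed import path; present in rlkit's examples, but verify locally.
    from rlkit.launchers.launcher_util import run_experiment

    def experiment(variant):
        # Build envs, networks, and the algorithm here, then train.
        pass

    run_experiment(
        experiment,
        exp_prefix="sac-example",  # hypothetical experiment name
        mode="ec2",                # "local" runs on this machine instead
        variant=dict(seed=0),
        use_gpu=True,              # matches use_gpu=True noted in Quick Start
    )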

Maintenance & Community

  • Initially developed by Vitchyr Pong, now maintained by RAIL Berkeley, primarily by Ashvin Nair.
  • Significant contributions from Murtaza Dalal and Steven Lin.
  • The infrastructure is based on rllab, and the Dockerfile is based on OpenAI's mujoco-py.

Licensing & Compatibility

  • The README does not explicitly state a license. Its rllab lineage and dependencies suggest a permissive license, but check the repository's LICENSE file before assuming compatibility with commercial use.

Limitations & Caveats

  • Some algorithms, such as Temporal Difference Models (TDMs) and the original RIG, are only available in the older v0.1.2 release.
  • Mac environment testing was done without a GPU.
  • AWS/GCP integration via doodad requires external setup knowledge.
Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 26 stars in the last 30 days
