cleanrl by vwxyzjn

Single-file RL algorithm implementations with research-friendly features

Created 6 years ago

8,787 stars

Top 5.9% on SourcePulse


8 Experts Love This Project

Alex Yu

Research Scientist at OpenAI; Cofounder of Luma AI

John Yang

Coauthor of SWE-bench, SWE-agent

Lysandre Debut

Chief Open-Source Officer at Hugging Face

Wei-Lin Chiang

Cofounder of LMArena

and 4 more!

Project Summary

CleanRL provides high-quality, single-file implementations of popular Deep Reinforcement Learning algorithms, targeting researchers and practitioners who need clear, understandable, and reproducible code. Its research-friendly features include TensorBoard logging, local reproducibility via seeding, and cloud integration, which together support efficient experimentation and prototyping.

How It Works

CleanRL's core philosophy is to encapsulate each algorithm variant within a single, standalone Python file. This approach prioritizes clarity and ease of understanding over modularity, allowing users to grasp all implementation details without navigating complex class hierarchies. This design choice facilitates rapid prototyping and debugging of advanced features.
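To make the single-file idea concrete, here is a minimal sketch of what such a layout can look like. It is not code from the CleanRL repository: the toy chain environment, the tabular Q-learning update, and every name in it (`parse_args`, `step`, `train`, the flag names) are assumptions chosen for illustration, and it uses only the standard library so it runs anywhere.

```python
# Hypothetical single-file sketch in the spirit described above; NOT CleanRL code.
# Hyperparameters, environment, and training loop all live in this one file.
import argparse
import random

N_STATES = 5        # toy chain: states 0..4, reward only on reaching state 4
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


def parse_args(argv=None):
    """All hyperparameters are declared in one visible place."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=1)
    parser.add_argument("--total-timesteps", type=int, default=5000)
    parser.add_argument("--learning-rate", type=float, default=0.5)
    parser.add_argument("--gamma", type=float, default=0.9)
    parser.add_argument("--epsilon", type=float, default=0.1)
    return parser.parse_args(argv)


def step(state, action):
    """Deterministic chain dynamics: clip to [0, N_STATES - 1]."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(args):
    rng = random.Random(args.seed)  # a seeded RNG makes runs repeatable
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    state = 0
    for _ in range(args.total_timesteps):
        if rng.random() < args.epsilon:
            action = rng.randrange(2)  # explore
        else:
            best = max(q[state])       # exploit, breaking ties at random
            action = rng.choice([a for a in range(2) if q[state][a] == best])
        next_state, reward, done = step(state, action)
        # One-step Q-learning target; no bootstrapping from terminal states.
        target = reward + (0.0 if done else args.gamma * max(q[next_state]))
        q[state][action] += args.learning_rate * (target - q[state][action])
        state = 0 if done else next_state
    return q


if __name__ == "__main__":
    for s, row in enumerate(train(parse_args())):
        print(s, [round(v, 3) for v in row])
```

Everything a reader needs sits in one scroll: changing the update rule or the exploration schedule means editing a few lines in place rather than subclassing anything. The trade-off is that a second algorithm would repeat most of this scaffolding.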

Quick Start & Requirements

Install: poetry install or pip install -r requirements/requirements.txt (with optional dependencies for specific environments like Atari, MuJoCo, Procgen, etc.).
Prerequisites: Python >=3.7.1,<3.11, Poetry 1.2.1+.
Run: poetry run python cleanrl/ppo.py --env-id CartPole-v0 --total-timesteps 50000
Docs: https://cleanrl.dev/
Benchmark: https://benchmark.cleanrl.dev/
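Because each algorithm is a standalone script, repeating an experiment across seeds comes down to re-invoking the same file. The sketch below only builds the command lines and prints them. The `build_command` helper is hypothetical, and the `--seed` flag is an assumption that does not appear in the Run line above; check the script's `--help` output before relying on it.

```python
# Hypothetical helper for composing Quick Start-style invocations; not part of CleanRL.
import shlex


def build_command(env_id, seed, total_timesteps, script="cleanrl/ppo.py"):
    """Return the argv list for one training run."""
    return [
        "poetry", "run", "python", script,
        "--env-id", env_id,
        "--seed", str(seed),  # assumed flag, see the note above
        "--total-timesteps", str(total_timesteps),
    ]


# One command per seed, mirroring the Run line above.
commands = [build_command("CartPole-v0", seed, 50000) for seed in (1, 2, 3)]

for cmd in commands:
    print(shlex.join(cmd))
```

To launch the runs instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)` from the repository root.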

Highlighted Details

Benchmarked implementations for 7+ algorithms across 34+ games.
Integrates with Weights and Biases for experiment tracking.
Supports capturing gameplay videos.
Offers cloud integration via Docker and AWS Batch.

Maintenance & Community

Active development with a Discord community for support.
Participates in the Open RL Benchmark project.
Past video recordings available on YouTube.

Licensing & Compatibility

MIT License.
Compatible with commercial use and closed-source linking.

Limitations & Caveats

CleanRL is not designed as a modular library: its scripts are not meant to be imported, and code is duplicated across algorithm implementations. The project is migrating to Gymnasium, with progress tracked in issue #277. Some optimizations, such as EnvPool for Atari, are Linux-specific.

Health Check

Last Commit: 6 months ago
Responsiveness: 1 day

Star History

296 stars in the last 30 days