Mava by instadeepai

MARL research codebase for fast experimentation in JAX

created 4 years ago
823 stars

Top 44.0% on sourcepulse

Project Summary

Mava is a research-focused codebase for multi-agent reinforcement learning (MARL) in JAX, designed for rapid experimentation and scalability. It provides researchers with fast, single-file implementations of state-of-the-art MARL algorithms, enabling quick iteration on new ideas.

How It Works

Mava leverages JAX's automatic differentiation and JIT compilation for high performance, allowing end-to-end JIT compilation of MARL training loops. It supports two distribution architectures: Anakin, for JAX-based environments, which enables full JIT compilation of the training loop; and Sebulba, for non-JAX environments, which parallelizes environment interaction across multiple CPU cores. This approach yields significantly faster experiment runtimes than non-JAX alternatives.

Quick Start & Requirements

  • Installation: Clone the repository and install dependencies using uv sync or pip install -e .
  • Prerequisites: Python 3.11 or 3.12. Users must install the correct JAX version for their hardware accelerator separately.
  • Getting Started: Run system files directly, e.g. python mava/systems/ppo/anakin/ff_ippo.py. Configuration is managed via Hydra, with overrides available from the terminal.
  • Resources: A Google Colab notebook is available for a quickstart: https://colab.research.google.com/github/instadeepai/Mava/blob/develop/examples/Quickstart.ipynb
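Hydra overrides can be passed straight on the command line when launching a system. A sketch of what such an invocation might look like (the override keys below are hypothetical; consult the config files shipped with the repository for the actual names):

```shell
# Run feed-forward IPPO, overriding hypothetical config keys from the terminal
python mava/systems/ppo/anakin/ff_ippo.py system.seed=42 env=rware
```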

Highlighted Details

  • Implements state-of-the-art MARL algorithms (PPO, Q-learning, SAC, etc.) with support for independent learners, CTDE, and heterogeneous agents.
  • Provides wrappers for JAX-based MARL environments and supports adding new ones.
  • Natively supports statistically robust evaluation by logging to JSON files compatible with MARL-eval.
  • Offers Anakin and Sebulba distribution architectures for scaling RL systems across JAX and non-JAX environments, respectively.

Maintenance & Community

  • Developed by The Research Team at InstaDeep.
  • Roadmap includes adding more Sebulba algorithm versions and scaling across multiple TPUs/GPUs.
  • Related repositories include OG-MARL, Jumanji, Matrax, Flashbax, and MARL-eval.

Licensing & Compatibility

  • License: Apache 2.0.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

Mava is not designed as a modular library and is intended to be used directly from the cloned repository. While it supports various environments, adding new ones requires using existing wrappers as a guide.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 2
  • Issues (30d): 2
  • Star History: 34 stars in the last 90 days
