maddpg by openai

MADDPG research paper implementation

Created 8 years ago

1,928 stars

Top 22.4% on SourcePulse

1 Expert Loves This Project

Luca Antiga

CTO of Lightning AI

Project Summary

This repository provides an implementation of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which is designed for mixed cooperative-competitive environments. It is aimed at researchers and practitioners in multi-agent reinforcement learning who want to run and experiment with MADDPG on scenarios from the Multi-Agent Particle Environments (MPE).

How It Works

MADDPG is an actor-critic algorithm that extends DDPG to multi-agent settings. Each agent has its own actor and its own critic. During training, each agent's critic takes the observations and actions of all agents as input, so the value function it learns accounts for what the other agents do. During execution, each actor uses only its own agent's observation. This combination of centralized training and decentralized execution lets agents learn effective policies in environments where other agents are learning at the same time.
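The split between decentralized actors and centralized critics can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the repository's TensorFlow code: the sizes (N_AGENTS, OBS_DIM, ACT_DIM, HIDDEN), the helper names (init_mlp, mlp, act, q_value), and the one-hidden-layer networks are all assumptions made for the example, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
N_AGENTS, OBS_DIM, ACT_DIM, HIDDEN = 3, 4, 2, 16


def init_mlp(in_dim, out_dim):
    """Return randomly initialized weights for a one-hidden-layer MLP."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "w2": rng.normal(0.0, 0.1, (HIDDEN, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Forward pass: ReLU hidden layer followed by a linear output layer."""
    h = np.maximum(0.0, x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]


# One actor and one critic per agent.
actors = [init_mlp(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]
joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
critics = [init_mlp(joint_dim, 1) for _ in range(N_AGENTS)]


def act(i, obs_i):
    """Decentralized execution: agent i sees only its own observation."""
    return np.tanh(mlp(actors[i], obs_i))


def q_value(i, all_obs, all_acts):
    """Centralized training: critic i sees every agent's observation and action."""
    joint = np.concatenate([*all_obs, *all_acts])
    return mlp(critics[i], joint).item()


all_obs = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
all_acts = [act(i, all_obs[i]) for i in range(N_AGENTS)]
q_values = [q_value(i, all_obs, all_acts) for i in range(N_AGENTS)]
```

Note that the critic's input width grows with the number of agents (here 3 x (4 + 2) = 18), while each actor's input stays at the size of a single observation. The critics are only needed during training, which is why execution can remain decentralized.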

Quick Start & Requirements

Install: pip install -e .
Dependencies: Python (3.5.4), OpenAI Gym (0.10.5), TensorFlow (1.8.0), NumPy (1.14.5). The MPE environment code must be downloaded separately and added to PYTHONPATH.
Run: python experiments/train.py --scenario <scenario_name> (e.g., python experiments/train.py --scenario simple)
Docs: Multi-Agent Particle Environments (MPE)
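Because the listed versions are exact pins, it can help to compare them against an environment before installing. The following is a minimal sketch under stated assumptions: the PyPI package names in PINNED are assumed to match the libraries named above, the helper names (requirement_lines, find_mismatches) are hypothetical, and example_installed is made-up data rather than output from a real environment. It avoids f-strings so that it would also run on the pinned Python 3.5.

```python
# Versions copied from the dependency list above; PyPI names are assumed.
PINNED = {"gym": "0.10.5", "tensorflow": "1.8.0", "numpy": "1.14.5"}


def requirement_lines(pinned):
    """Render pins in pip's 'name==version' requirement syntax, sorted by name."""
    return ["{}=={}".format(name, version) for name, version in sorted(pinned.items())]


def find_mismatches(pinned, installed):
    """Return {name: (wanted, found)} for packages that are missing or differ.

    A missing package is reported with found=None.
    """
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


# Hypothetical environment: a newer NumPy and no TensorFlow installed.
example_installed = {"gym": "0.10.5", "numpy": "1.26.4"}
problems = find_mismatches(PINNED, example_installed)
```

In practice the installed mapping would be filled from whatever package inventory the environment exposes; that step is left out here because it depends on the Python version in use.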

Highlighted Details

Implements MADDPG for mixed cooperative-competitive environments.
Designed to work with the Multi-Agent Particle Environments (MPE).
Supports configurable numbers of agents and policy types (MADDPG or DDPG).

Maintenance & Community

Status: Archived (code provided as-is, no updates expected).
The original implementation of policy ensembles and policy estimation is not part of this codebase; the README links to it separately.

Licensing & Compatibility

License: Not explicitly stated in the README.
Compatibility: Requires TensorFlow 1.8.0 and OpenAI Gym 0.10.5, both old releases, so installing and running the code on current systems may be difficult.

Limitations & Caveats

The repository is archived, so no further updates or bug fixes are expected. The code has been restructured since the original paper, and results may differ from those reported there. The strict dependency on old library versions (such as TensorFlow 1.8.0) is a significant barrier to adoption for current projects.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

19 stars in the last 30 days