maddpg by openai

MADDPG research paper implementation

created 7 years ago
1,843 stars

Project Summary

This repository provides an implementation of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which is designed for mixed cooperative-competitive environments. It is aimed at researchers and practitioners in multi-agent reinforcement learning who want to experiment with MADDPG in environments such as the Multi-Agent Particle Environments (MPE).

How It Works

MADDPG is an actor-critic algorithm that extends DDPG to multi-agent settings. Each agent has its own actor and its own critic. During training, each agent's critic takes the observations and actions of all agents as input, so the value function it learns accounts for what the other agents do. At execution time, each actor selects actions from its own observation alone. This combination of centralized training and decentralized execution lets agents learn effective policies in complex, dynamic environments.
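The input structure described above can be illustrated with a minimal NumPy sketch. It is not the repository's TensorFlow code: the network sizes, the helper names (`init_mlp`, `mlp`, `act`, `q_value`), and the random untrained weights are all assumptions made for illustration. It shows only which inputs each network sees, not the training updates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen for illustration only.
N_AGENTS, OBS_DIM, ACT_DIM, HIDDEN = 3, 4, 2, 16


def init_mlp(in_dim, out_dim):
    """Create random weights for a one-hidden-layer network."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "w2": rng.normal(0.0, 0.1, (HIDDEN, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Forward pass: linear -> ReLU -> linear."""
    hidden = np.maximum(0.0, x @ params["w1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"]


# One actor and one critic per agent.
actors = [init_mlp(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]
joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
critics = [init_mlp(joint_dim, 1) for _ in range(N_AGENTS)]


def act(agent, own_obs):
    """Decentralized execution: the actor sees only its own observation."""
    return np.tanh(mlp(actors[agent], own_obs))


def q_value(agent, all_obs, all_actions):
    """Centralized critic: the input is every agent's observation and action."""
    joint = np.concatenate([np.ravel(all_obs), np.ravel(all_actions)])
    return mlp(critics[agent], joint).item()


observations = rng.normal(size=(N_AGENTS, OBS_DIM))
actions = np.stack([act(i, observations[i]) for i in range(N_AGENTS)])
q_values = [q_value(i, observations, actions) for i in range(N_AGENTS)]
```

Because every critic receives the joint observation-action vector, changing another agent's action changes the value an agent's critic reports, while each actor's output depends only on that agent's own observation.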

Quick Start & Requirements

  • Install: pip install -e .
  • Dependencies: Python (3.5.4), OpenAI gym (0.10.5), tensorflow (1.8.0), numpy (1.14.5). MPE environment code must be downloaded and added to PYTHONPATH.
  • Run: python experiments/train.py --scenario <scenario_name> (e.g., python experiments/train.py --scenario simple)
  • Docs: Multi-Agent Particle Environments (MPE)
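
Since the dependency versions listed above are exact pins, it can help to compare an environment against them before running anything. The sketch below is one way to do that; the `PINNED` table copies the versions from the Dependencies bullet, while the `find_mismatches` helper and the example `installed` mapping are hypothetical and not part of the repository.

```python
# Versions copied from the Dependencies bullet above.
PINNED = {"gym": "0.10.5", "tensorflow": "1.8.0", "numpy": "1.14.5"}


def find_mismatches(installed):
    """Return (package, expected, found) for each pin that is not met.

    `installed` maps a package name to its version string. A package
    that is missing from the mapping is reported with found=None.
    """
    problems = []
    for name, expected in sorted(PINNED.items()):
        found = installed.get(name)
        if found != expected:
            problems.append((name, expected, found))
    return problems


# Hypothetical environment: gym matches, tensorflow is newer, numpy is absent.
example = {"gym": "0.10.5", "tensorflow": "2.15.0"}
for name, expected, found in find_mismatches(example):
    print("{}: expected {}, found {}".format(name, expected, found))
```

The `installed` mapping would have to be filled in from the actual environment, for instance from the output of `pip freeze`. The sketch avoids f-strings so that it also runs on the Python 3.5 interpreter listed above.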

Highlighted Details

  • Implements MADDPG for mixed cooperative-competitive environments.
  • Designed to work with the Multi-Agent Particle Environments (MPE).
  • Supports configurable numbers of agents and policy types (MADDPG or DDPG).

Maintenance & Community

  • Status: Archived (code provided as-is, no updates expected).
  • The original implementation for policy ensemble and policy estimation is available at a separate link.

Licensing & Compatibility

  • License: Not explicitly stated in the README.
  • Compatibility: Requires older versions of TensorFlow (1.8.0) and OpenAI gym (0.10.5), which may pose compatibility challenges with modern systems.

Limitations & Caveats

The repository is archived, meaning no further updates or bug fixes are expected. The code structure has been modified since the original paper, and results may vary from those reported. The strict dependency on older library versions (TensorFlow 1.8.0) is a significant barrier to adoption for current projects.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 65 stars in the last 90 days
