maddpg-pytorch  by shariqiqbal2810

PyTorch implementation of the MADDPG multi-agent RL algorithm

created 7 years ago
644 stars

Top 52.7% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, designed for training agents in mixed cooperative-competitive environments. It is suitable for researchers and practitioners exploring multi-agent reinforcement learning strategies.

How It Works

The implementation follows the MADDPG framework, which extends DDPG to multi-agent settings. During training, each agent has a centralized critic that conditions on the observations and actions of all agents, while its decentralized actor acts on that agent's own observation only. Because the critic sees every agent's action, the environment appears stationary from its point of view even as the other agents' policies change, which stabilizes learning.
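The split between decentralized actors and a centralized critic can be illustrated with a minimal sketch in plain Python. This is not the repository's code: the function names, the dimensions, and the linear scoring function (a stand-in for a neural network) are all hypothetical.

```python
from typing import List, Sequence


def actor_input(observations: Sequence[Sequence[float]], agent: int) -> List[float]:
    """Decentralized actor: the agent sees only its own observation."""
    return list(observations[agent])


def critic_input(
    observations: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
) -> List[float]:
    """Centralized critic: all observations followed by all actions."""
    joint: List[float] = []
    for obs in observations:
        joint.extend(obs)
    for act in actions:
        joint.extend(act)
    return joint


def linear_q(weights: Sequence[float], joint: Sequence[float]) -> float:
    """Hypothetical stand-in for a critic network: a single linear layer."""
    if len(weights) != len(joint):
        raise ValueError("weights must match the joint input length")
    return sum(w * x for w, x in zip(weights, joint))


# Hypothetical setup: three agents, 2-D observations, 1-D actions.
observations = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
actions = [[1.0], [0.0], [-1.0]]

joint = critic_input(observations, actions)
q_value = linear_q([1.0] * len(joint), joint)
```

The critic's input size grows with the number of agents (here 3 × 2 observation values plus 3 × 1 action values), whereas each actor's input stays the size of a single observation. That is why only the actors are needed at execution time.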

Quick Start & Requirements

  • Run command: python main.py --help (prints the available command-line options)
  • Prerequisites: PyTorch (0.3.0.post4), OpenAI Gym (0.9.4), Tensorboard (0.4.0rc3), Tensorboard-Pytorch (1.0). The README also lists OpenAI baselines (commit hash: 98257ef8c9bd23a24a330731ae54ed086d9ce4a7) and a fork of Multi-agent Particle Environments.
  • Links: OpenAI baselines, Multi-agent Particle Environments

Highlighted Details

  • Implements MADDPG for cooperative-competitive scenarios like Physical Deception and Cooperative Communication.
  • Includes a Predator-Prey environment with a single prey and three predators.
  • Incorporates techniques like gradient norm clipping and policy regularization.
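Gradient norm clipping, mentioned in the list above, rescales all gradients together when their combined L2 norm exceeds a threshold. The following is a minimal sketch in plain Python under assumed values. The function name, the list-of-lists gradient layout, and the threshold are hypothetical, and the repository's actual call and settings are not stated here. Current PyTorch releases provide torch.nn.utils.clip_grad_norm_ for the same purpose.

```python
import math
from typing import List


def clip_grad_norm(grads: List[List[float]], max_norm: float) -> float:
    """Scale gradients in place so their global L2 norm is at most max_norm.

    Returns the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for grad in grads:
            for i in range(len(grad)):
                grad[i] *= scale
    return total_norm


# Hypothetical gradients for two parameter tensors, flattened to lists.
grads = [[3.0, 0.0], [0.0, 4.0]]
norm_before = clip_grad_norm(grads, max_norm=0.5)
norm_after = math.sqrt(sum(g * g for grad in grads for g in grad))
```

Scaling every gradient by the same factor preserves the update direction and only limits its magnitude, which helps when critic targets are noisy.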

Maintenance & Community

  • The project is a personal fork, with no explicit mention of active maintenance or community channels.

Licensing & Compatibility

  • The README does not specify a license. Compatibility for commercial use or closed-source linking is not addressed.

Limitations & Caveats

The implementation is a personal fork and may not reflect the latest advancements or best practices. Features like ensemble training, inferring other agents' policies, and mixed continuous/discrete action spaces are explicitly noted as not implemented. The specified dependency versions are from the time of use and may not be strict requirements.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 20 stars in the last 90 days
