
An implementation of MADDPG

1. Introduction

This is a PyTorch implementation of the multi-agent deep deterministic policy gradient (MADDPG) algorithm.
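In MADDPG, each agent has a decentralized actor that acts on its own observation, while training uses one centralized critic per agent that sees the observations and actions of all agents. Below is a minimal, self-contained sketch of one such update step; the network sizes, learning rates, and the dummy batch are illustrative assumptions and do not mirror this repository's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_agents, obs_dim, act_dim, batch = 2, 8, 2, 64   # illustrative sizes only
gamma, tau = 0.95, 0.01

class Actor(nn.Module):
    """Decentralized actor: maps one agent's observation to its action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim), nn.Tanh())
    def forward(self, obs):
        return self.net(obs)

class Critic(nn.Module):
    """Centralized critic: conditions on all agents' observations and actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_agents * (obs_dim + act_dim), 64), nn.ReLU(),
            nn.Linear(64, 1))
    def forward(self, obs_all, act_all):
        return self.net(torch.cat([obs_all, act_all], dim=1))

actors    = [Actor()  for _ in range(n_agents)]
critics   = [Critic() for _ in range(n_agents)]
actors_t  = [Actor()  for _ in range(n_agents)]   # target networks
critics_t = [Critic() for _ in range(n_agents)]
for net, tgt in zip(actors + critics, actors_t + critics_t):
    tgt.load_state_dict(net.state_dict())
actor_opt  = [torch.optim.Adam(a.parameters(), lr=1e-4) for a in actors]
critic_opt = [torch.optim.Adam(c.parameters(), lr=1e-3) for c in critics]

# Dummy batch standing in for a sample from a replay buffer.
obs      = torch.randn(batch, n_agents, obs_dim)
act      = torch.randn(batch, n_agents, act_dim)
rew      = torch.randn(batch, n_agents)
next_obs = torch.randn(batch, n_agents, obs_dim)

for i in range(n_agents):
    # Critic update: the TD target uses the target actors of *all* agents.
    with torch.no_grad():
        next_act = torch.stack([actors_t[j](next_obs[:, j])
                                for j in range(n_agents)], dim=1)
        q_next = critics_t[i](next_obs.flatten(1), next_act.flatten(1)).squeeze(1)
        target = rew[:, i] + gamma * q_next
    q = critics[i](obs.flatten(1), act.flatten(1)).squeeze(1)
    critic_loss = F.mse_loss(q, target)
    critic_opt[i].zero_grad(); critic_loss.backward(); critic_opt[i].step()

    # Actor update: ascend the centralized Q w.r.t. agent i's own action.
    act_i = torch.cat([actors[j](obs[:, j]) if j == i else act[:, j]
                       for j in range(n_agents)], dim=1)
    actor_loss = -critics[i](obs.flatten(1), act_i).mean()
    actor_opt[i].zero_grad(); actor_loss.backward(); actor_opt[i].step()

    # Soft update of the target networks.
    for net, tgt in [(actors[i], actors_t[i]), (critics[i], critics_t[i])]:
        for p, tp in zip(net.parameters(), tgt.parameters()):
            tp.data.mul_(1 - tau).add_(tau * p.data)
```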

The experimental environment is a modified version of Waterworld based on MADRL.

2. Environment

The main features of the modified Waterworld environment (compared to MADRL) are:

  • Evaders and poisons now bounce off the walls according to physical rules.
  • The evaders, pursuers, and poisons now have the same size, so that random actions lead to an average reward of around 0.
  • Exactly `n_coop` agents are required to catch a food particle (a short interaction sketch follows this list).
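As a rough usage sketch (the class name, constructor arguments, and the gym-style reset/step interface below are assumptions based on MADRL's Waterworld and may differ from the actual code in this repository):

```python
import numpy as np
from madrl_environments.pursuit import MAWaterWorld_mod   # assumed module path

# The constructor arguments are assumptions for illustration and may not match
# the actual signature of the modified environment.
world = MAWaterWorld_mod(n_pursuers=2, n_evaders=50, n_poison=50,
                         n_coop=2,           # pursuers required per capture
                         food_reward=10.0,
                         poison_reward=-1.0)

obs = world.reset()                          # one observation per pursuer
for _ in range(100):
    # Random 2-D thrust actions, one per pursuer (assumed action format).
    actions = [np.random.uniform(-1.0, 1.0, size=2) for _ in range(2)]
    obs, rewards, done, _ = world.step(actions)
    if done:
        obs = world.reset()
```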

3. Dependencies

  • pytorch
  • visdom
  • python==3.6.1 (anaconda/miniconda is recommended)
  • opencv, if you need to render the environments (a quick dependency check follows this list)
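A quick sanity check for these dependencies could look like the following sketch (it assumes a visdom server has already been started separately with `python -m visdom.server`):

```python
import torch
import visdom

print("pytorch:", torch.__version__)
vis = visdom.Visdom()
assert vis.check_connection(), "visdom server is not reachable"

try:
    import cv2                  # only needed when rendering the scenes
    print("opencv:", cv2.__version__)
except ImportError:
    print("opencv not installed; rendering will be unavailable")
```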

4. Install

  • Install MADRL.
  • Replace the madrl_environments/pursuit directory with the one in this repo.
  • Run `python main.py`.

If scene rendering is enabled, it is recommended to install opencv through conda-forge.

5. Results

Two agents, cooperation = 2

The two agents need to cooperate to catch the food to obtain a reward of 10.

![demo](PNG/demo.gif)

![](PNG/3.png)

The average reward:

![](PNG/4.png)

One agent, cooperation = 1

![](PNG/newplot.png)

6. TODO

  • Reproduce the experiments in the paper with competitive environments.
