[INVESTIGATION] MADDPG/MAD4PG are slower than MAPPO in certain instances (pixel-based environments?)

Open AsadJeewa opened this issue 3 years ago • 0 comments

What do you want to investigate?

MADDPG and MAD4PG are both significantly slower than MAPPO in certain instances.

Run MADDPG on Coop pong / PCB Grid for n steps with the same network size
Run MAPPO on Coop pong / PCB Grid for n steps with the same network size
Observe that MAPPO takes significantly less time to run the same number of executor steps:
PPO: ~2:45 hours for 2e6 executor steps
D4PG: ~36 hours for 2e6 executor steps
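To make the gap easier to compare across runs, the two timings reported above can be converted into average throughput. This is only a rough sketch: it assumes "~2:45 hours" means 2 hours 45 minutes, and the helper name `steps_per_second` is made up for illustration (it is not part of any library).

```python
# Back-of-the-envelope throughput from the timings reported in this issue.
# Assumption: "~2:45 hours" means 2 hours 45 minutes of wall-clock time.

EXECUTOR_STEPS = 2e6  # executor steps in both runs


def steps_per_second(steps: float, hours: float) -> float:
    """Average executor steps per wall-clock second (hypothetical helper)."""
    return steps / (hours * 3600.0)


ppo_hours = 2 + 45 / 60  # ~2:45 hours
d4pg_hours = 36.0        # ~36 hours

ppo_rate = steps_per_second(EXECUTOR_STEPS, ppo_hours)
d4pg_rate = steps_per_second(EXECUTOR_STEPS, d4pg_hours)
slowdown = d4pg_hours / ppo_hours  # same step count, so the time ratio suffices

print(f"PPO:  {ppo_rate:.0f} executor steps/s")
print(f"D4PG: {d4pg_rate:.1f} executor steps/s")
print(f"D4PG is ~{slowdown:.1f}x slower for the same number of steps")
```

Under that assumption, the D4PG run is roughly 13x slower per executor step. Reporting steps per second for each baseline experiment would make it easier to see whether the gap changes between pixel-based and non-pixel environments.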

NB: This could be specific to pixel-based environments or due to other environment characteristics. It could also be a Launchpad issue, since significantly more evaluator steps are run for MADDPG when an interval is not set.

Definition of done

Baseline experiments highlighting specific characteristics/instances that show a clear difference in performance
Bug fixed (in system or env) if one exists

[Optional] Results

What was the conclusion of your investigation?

[Optional] Discussion/Future Investigations

Apr 20 '22 09:04 AsadJeewa