Benjamin Black
@Trinkle23897 As for your comment that it breaks the Markov property, I think this is true. I would have to create a different environment where the previous two actions are...
Hi, what did you mean by "instead of letting MAPM changing buffer over and over again"? Is this something that the MAPM does now, or a potential solution to...
Ok, I updated the example so that the env returns the previous 2 observations, so I think the appropriate Markov property should now hold.
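For anyone reading along, here is a minimal sketch of the idea, assuming a Gym-style single-agent env with a Box observation space and the older step/reset API; the wrapper name is made up for illustration and is not part of the actual example:

```python
import numpy as np
import gym


class LastTwoObsWrapper(gym.Wrapper):
    """Illustrative wrapper (hypothetical name): returns the previous and current
    observations concatenated, so a policy that only sees a single 'observation'
    still gets enough history for the Markov property to hold."""

    def __init__(self, env):
        super().__init__(env)
        low = np.concatenate([env.observation_space.low] * 2)
        high = np.concatenate([env.observation_space.high] * 2)
        self.observation_space = gym.spaces.Box(
            low=low, high=high, dtype=env.observation_space.dtype
        )
        self._prev_obs = None

    def reset(self, **kwargs):
        obs = self.env.reset(**kwargs)
        self._prev_obs = obs
        # At reset there is no history yet, so duplicate the first observation.
        return np.concatenate([obs, obs])

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        stacked = np.concatenate([self._prev_obs, obs])
        self._prev_obs = obs
        return stacked, reward, done, info
```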
@p-veloso I just saw this, but while the SuperSuit example here: https://github.com/PettingZoo-Team/SuperSuit#parallel-environment-vectorization is for Stable Baselines, all it does is translate the parallel environment into a vector environment. Since tianshou...
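Roughly, the conversion looks like the sketch below. This assumes a SuperSuit version that exposes pettingzoo_env_to_vec_env_v1 and concat_vec_envs_v1 (the version suffixes have changed over time), and pistonball is just a stand-in parallel PettingZoo environment:

```python
import supersuit as ss
from pettingzoo.butterfly import pistonball_v6  # any parallel-API PettingZoo env works

# Start from a parallel-API PettingZoo environment.
env = pistonball_v6.parallel_env()

# Treat each agent as one "environment" in a vector env: observations from all
# agents are batched along the first axis, and actions are expected the same way.
vec_env = ss.pettingzoo_env_to_vec_env_v1(env)

# Optionally concatenate several copies of the game for a larger batch; base_class
# selects which vector-env API the result follows, so it is not SB3-specific.
vec_env = ss.concat_vec_envs_v1(vec_env, 4, num_cpus=0, base_class="gym")
```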
I agree that keeping the algorithm (A2C, TD3, etc.) separate from the framework (Ape-X, parameter sharing, etc.) is a powerful way of supporting a wide variety of use cases easily....
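To make the separation concrete, here is a rough sketch of what I mean; the class and method names are hypothetical, not anything from an existing library:

```python
from typing import Any, Protocol


class Agent(Protocol):
    """Minimal algorithm interface (hypothetical): A2C, TD3, DQN, etc. can all sit
    behind this, and the framework layer never needs to know which one it is."""

    def act(self, observation: Any) -> Any: ...
    def observe(self, observation: Any, reward: float, done: bool) -> None: ...


class ParameterSharing:
    """Framework layer (hypothetical): one shared learner controls every agent name
    in a multi-agent env, independent of which algorithm that learner runs."""

    def __init__(self, shared_agent: Agent, agent_names):
        self.shared_agent = shared_agent
        self.agent_names = list(agent_names)

    def act(self, agent_name: str, observation: Any) -> Any:
        # Every agent name is routed to the same underlying learner.
        return self.shared_agent.act(observation)

    def observe(self, agent_name: str, observation: Any, reward: float, done: bool) -> None:
        self.shared_agent.observe(observation, reward, done)
```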
@cpnota One thing blocking this is that several internal features, including the generalized advantage buffer used by PPO, only work for parallel agents, and there is no parallel multi-agent experiment.
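For reference, the computation that buffer implements is roughly standard generalized advantage estimation over a batch of parallel environments; this is a generic sketch (array shapes are assumptions), not the library's actual code:

```python
import numpy as np


def gae(rewards, values, dones, last_values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over parallel environments.

    rewards, values, dones: arrays of shape (T, num_envs)
    last_values: value estimates for the state after the final step, shape (num_envs,)
    Returns advantages of shape (T, num_envs).
    """
    T = rewards.shape[0]
    advantages = np.zeros_like(rewards)
    next_advantage = np.zeros_like(last_values)
    next_values = last_values
    for t in reversed(range(T)):
        not_done = 1.0 - dones[t]
        # TD error for step t, zeroing the bootstrap term when the episode ended.
        delta = rewards[t] + gamma * next_values * not_done - values[t]
        next_advantage = delta + gamma * lam * not_done * next_advantage
        advantages[t] = next_advantage
        next_values = values[t]
    return advantages
```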
I tried this with A2C (same code, just with a2c) and got the following error:
```
Traceback (most recent call last):
  File "independent_atari.py", line 7, in
    experiment.train(frames=2e6)
  File "/home/ben/class_projs/autonomous-learning-library/all/experiments/single_env_experiment.py", line...
```
Ah, I see. Yes, it is very hard to get that from the error message. Before, when we were trying to use ALL for our primary work with PettingZoo, the...
So, for more context on this particular issue, the problem came up when someone wanted to use PPO to train one agent and DQN to train another. This is a very...
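As a rough sketch of what that use case looks like on a PettingZoo AEC env (this uses the older API where env.last() returns four values, and RandomPolicy is just a placeholder for the real PPO/DQN learners):

```python
import random
from pettingzoo.classic import tictactoe_v3  # stand-in two-player env


class RandomPolicy:
    """Placeholder for a real PPO or DQN learner; only the interface matters here."""

    def act(self, observation):
        legal = observation["action_mask"].nonzero()[0]
        return int(random.choice(legal))


env = tictactoe_v3.env()
env.reset()

# Each agent name gets its own independently trained policy; nothing forces them
# to share an algorithm, so one could be PPO and the other DQN.
policies = {agent: RandomPolicy() for agent in env.possible_agents}

for agent in env.agent_iter():
    observation, reward, done, info = env.last()
    action = None if done else policies[agent].act(observation)
    env.step(action)
```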
A couple comments:
1) I took a look at the env_checker, and didn't see anything that should affect seeding. Perhaps this is a broader issue that affects all environments?
2)...
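A quick way to check whether seeding itself is the problem, independent of the env_checker, is something like this sketch (assuming the older Gym API with env.seed() and a 4-tuple step return; make_env is a placeholder for whatever constructs the environment):

```python
import numpy as np


def rollout(env, seed, steps=100):
    """Collect a short trajectory with a fixed seed for both env and action sampling."""
    env.seed(seed)                 # older Gym-style seeding
    env.action_space.seed(seed)
    observations = [env.reset()]
    for _ in range(steps):
        obs, reward, done, info = env.step(env.action_space.sample())
        observations.append(obs)
        if done:
            observations.append(env.reset())
    return observations


def is_deterministic(make_env, seed=42):
    """Two identically seeded rollouts should match element-wise if seeding works."""
    a = rollout(make_env(), seed)
    b = rollout(make_env(), seed)
    return all(np.array_equal(x, y) for x, y in zip(a, b))
```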