
[RLLib] Fix action masking example

Open drblallo opened this issue 1 year ago • 0 comments

Why are these changes needed?

Hi, I have been trying to use the action masking example. It works fine in single-agent scenarios, but it is broken in multi-agent ones.

I am not sure these fixes are the best possible ones; the fix in the second commit is almost surely not optimal.

The first commit adds the method _compute_values to the masking example, which is needed in a multi-agent context. I don't know exactly why this is the case, nor whether it was intended or is actually an error in the multi-agent code.

The second commit removes the check on the input of an action-masked module, which is expected to be a spaces.Dict. When performing a load_state, however, the module is created by being passed the content of the spaces.Dict directly instead.

I believe the issue is that, when writing to disk, it ends up writing the serialized state of the underlying PPO module, which has never seen the spaces.Dict because it was unwrapped at creation time. I have been unable to determine whether this is the real issue, and I would not know how to fix it.
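To make the suspected failure mode concrete, here is a minimal sketch of it. It does not use RLlib code: BoxSpace and DictSpace are hypothetical stand-ins for the gymnasium space classes, the key names "action_mask" and "observations" are assumed, and unwrap_strict / unwrap_tolerant are illustrative helpers, not functions from the example. The tolerant variant is only one possible alternative to removing the check outright, as the second commit does.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for gymnasium.spaces.Box and gymnasium.spaces.Dict,
# so the sketch runs without gymnasium or RLlib installed.
@dataclass
class BoxSpace:
    shape: tuple


@dataclass
class DictSpace:
    spaces: dict


def unwrap_strict(observation_space):
    """Mimic a check that only accepts the wrapped Dict space."""
    if not isinstance(observation_space, DictSpace):
        raise ValueError("expected a Dict observation space")
    return observation_space.spaces["observations"]


def unwrap_tolerant(observation_space):
    """Accept either the Dict space or its already-unwrapped content."""
    if isinstance(observation_space, DictSpace):
        return observation_space.spaces["observations"]
    return observation_space


# Assumed layout of the masked observation space.
wrapped = DictSpace(
    {"action_mask": BoxSpace((4,)), "observations": BoxSpace((8,))}
)

# At creation time the module sees the Dict and unwraps it.
inner = unwrap_strict(wrapped)

# Hypothesis from the description: the saved state only records the
# unwrapped space, so a restore passes that space back to the module.
try:
    unwrap_strict(inner)
    strict_restore_ok = True
except ValueError:
    strict_restore_ok = False

# The tolerant variant passes the already-unwrapped space through unchanged.
tolerant_restore = unwrap_tolerant(inner)
```

If the hypothesis holds, a strict check fails on restore while a tolerant one does not, at the cost of no longer catching a genuinely wrong observation space.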

[x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
[] I've run scripts/format.sh to lint the changes in this PR.
[] I've included any doc changes needed for https://docs.ray.io/en/master/.
[] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
[] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
[] Testing Strategy
  [] Unit tests
  [] Release tests
  [x] This PR is not tested :(

drblallo, Apr 08 '24 13:04