
Check numerics in environments at runtime

Open simonsays1980 opened this issue 3 years ago • 0 comments

Motivated by a larger error in my project and inspired by the VecCheckNan class in stable-baselines, I wrote an environment wrapper for RLlib's BaseEnv that checks an environment's numerics at runtime.

I proposed this feature in the related issue #31430, following the discussion with @sven1977 in issue #30471.

The wrapper is activated by an additional configuration parameter in the AlgorithmConfig. It wraps a BaseEnv and overrides its three main methods poll(), send_actions() and try_reset(); all other attributes are taken over from the wrapped environment.
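The delegation pattern described here can be sketched in plain Python. This is only an illustration, not the code in this PR: ToyEnv is a hypothetical stand-in rather than RLlib's BaseEnv, its method signatures and return values are simplified assumptions, and the `calls` list only marks where the numeric checks would be hooked in.

```python
class ToyEnv:
    """Hypothetical stand-in for the wrapped environment (not RLlib's BaseEnv)."""

    num_envs = 1

    def poll(self):
        # Simplified: return only observations as {env_id: {agent_id: obs}}.
        return {0: {"agent0": [0.0, 1.0]}}

    def send_actions(self, action_dict):
        self.last_actions = action_dict

    def try_reset(self, env_id=None):
        return {env_id: {"agent0": [0.0, 0.0]}}


class DelegatingWrapper:
    """Overrides three methods and forwards everything else to the wrapped env."""

    def __init__(self, env):
        self._env = env
        self.calls = []  # Records which overridden methods were hit.

    def poll(self):
        self.calls.append("poll")  # A numeric check on the result would go here.
        return self._env.poll()

    def send_actions(self, action_dict):
        self.calls.append("send_actions")  # A check on the actions would go here.
        self._env.send_actions(action_dict)

    def try_reset(self, env_id=None):
        self.calls.append("try_reset")  # A check on the reset observations would go here.
        return self._env.try_reset(env_id)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so every attribute that is
        # not defined on the wrapper is taken from the wrapped environment.
        if name == "_env":
            raise AttributeError(name)
        return getattr(self._env, name)
```

With this layout, `DelegatingWrapper(ToyEnv()).num_envs` resolves to the wrapped environment's attribute, while the three overridden methods pass through the wrapper first.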

The wrapper checks the following values during each call:

  • poll(): the observations and rewards for each environment and agent,
  • try_reset(): the observations for each environment and agent,
  • send_actions(): the actions for each environment and agent.

If NaN/Inf values are found, a message is printed that shows the faulty values together with the preceding counterpart that might have caused them (e.g. the last action for a faulty observation).
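As a rough sketch of such a check (not the implementation in this PR), the following assumes values arrive as nested dicts of the form {env_id: {agent_id: value}} and that every value converts to a flat numeric NumPy array. The function names find_non_finite and describe are hypothetical.

```python
import numpy as np


def find_non_finite(values, check_inf=True):
    """Return the (env_id, agent_id) keys whose value contains NaN (or Inf)."""
    faulty = []
    for env_id, per_agent in values.items():
        for agent_id, value in per_agent.items():
            arr = np.asarray(value, dtype=float)
            # np.isfinite is False for NaN and +/-Inf; np.isnan only flags NaN.
            bad = ~np.isfinite(arr) if check_inf else np.isnan(arr)
            if bad.any():
                faulty.append((env_id, agent_id))
    return faulty


def describe(faulty, observations, last_actions):
    """Build one message per faulty observation, including the same agent's last action."""
    lines = []
    for env_id, agent_id in faulty:
        previous = last_actions.get(env_id, {}).get(agent_id)
        lines.append(
            f"env {env_id}, agent {agent_id}: "
            f"observation={observations[env_id][agent_id]}, last action={previous}"
        )
    return lines
```

Nested observation spaces (dicts or tuples of arrays) would need to be flattened before a check like this; that step is left out here.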

There are three configuration parameters to modify the wrapper's behavior:

  • raise_exception: Raise an exception when a NaN/Inf is encountered. Default False.
  • check_inf: Also check for Inf values. Default True.
  • warn_once: Warn only at the first occurrence of a NaN/Inf. Default False.
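One way these three flags could interact is sketched below. The class name NumericsReporter, the use of ValueError, and the rule that raise_exception takes precedence over warn_once are assumptions made for illustration; the PR text does not specify them.

```python
import math


class NumericsReporter:
    """Hypothetical helper showing how the three flags might be combined."""

    def __init__(self, raise_exception=False, check_inf=True, warn_once=False):
        self.raise_exception = raise_exception
        self.check_inf = check_inf
        self.warn_once = warn_once
        self._warned = False

    def is_faulty(self, x):
        # NaN is always faulty; Inf only counts when check_inf is enabled.
        x = float(x)
        return math.isnan(x) or (self.check_inf and math.isinf(x))

    def report(self, message):
        """Raise or print according to the flags; return True if a message was printed."""
        if self.raise_exception:
            raise ValueError(message)  # The exception type is an assumption.
        if self.warn_once and self._warned:
            return False  # Stay silent after the first warning.
        self._warned = True
        print(message)
        return True
```

With the defaults, every faulty value produces a printed message and execution continues, which suits long training runs where a single NaN should be logged rather than abort the job.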

NOTE: So far, only the preceding counterpart of the same agent in a MultiAgentEnv is reported, i.e. the agent0 action for a faulty agent0 observation. If the action of another agent can cause a faulty value for an agent, further behavior has to be added to cover that case.

TESTS: I wrote some tests for the wrapper to check its behavior. All are passing together with some other environment tests.

Related issue number

#31430

Checks

  • [x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [x] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

simonsays1980 • Jan 04 '23 12:01