phasic-policy-gradient

Does this implementation of ppg support only discrete observation and action types?

Open RajS999 opened this issue 4 years ago • 0 comments

I was trying out this code with a custom gym environment (a non-gaming, time-series environment already tested to work with baselines), converted to a gym3 environment using gym3.interop.FromGymEnv(), and ended up getting the following error:

Traceback (most recent call last):
  File "/home/user/workspace/phasic_policy_gradient/train.py", line 58, in train_fn
    model = ppg.PhasicValueModel(venv.ob_space, venv.ac_space, enc_fn, arch=arch)
  File "/home/user/workspace/phasic_policy_gradient/ppg.py", line 97, in __init__
    pi_outsize, self.make_distr = distr_builder(actype)
  File "/home/user/workspace/phasic_policy_gradient/distr_builder.py", line 47, in distr_builder
    return tensor_distr_builder(ac_type)
  File "/home/user/workspace/phasic_policy_gradient/distr_builder.py", line 35, in tensor_distr_builder
    raise ValueError(f"Expected ScalarType, got {type(ac_space)}")
ValueError: Expected ScalarType, got <class 'gym3.types.TensorType'>
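The failure mode can be sketched with hypothetical stand-ins for the gym3 types (the `ScalarType`/`TensorType` classes below are simplified mocks, not the real `gym3.types` implementations): a gym `Box` space becomes a tensor-valued type, while the type check in `tensor_distr_builder` only accepts a scalar (discrete-style) type, so the builder raises before any distribution is constructed.

```python
# Hypothetical stand-ins for gym3's type classes, only to illustrate
# the dispatch that fails in distr_builder.py.
class ScalarType:
    """Mock of a scalar element type, e.g. what a gym Discrete maps to."""
    def __init__(self, eltype):
        self.eltype = eltype

class TensorType:
    """Mock of a tensor-valued type, e.g. what a gym Box maps to."""
    def __init__(self, eltype, shape):
        self.eltype = eltype
        self.shape = shape

def tensor_distr_builder(ac_space):
    # Mirrors the check that raises in the traceback: anything that is
    # not a ScalarType (e.g. a Box/Real tensor type) is rejected.
    if not isinstance(ac_space, ScalarType):
        raise ValueError(f"Expected ScalarType, got {type(ac_space)}")
    return "categorical head"

box_like = TensorType("real", (30,))  # analogous to the R[30] action space
try:
    tensor_distr_builder(box_like)
except ValueError as e:
    print(e)  # same "Expected ScalarType, got ..." message as above
```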

I debugged and found that venv.ac_space has type R[30] and venv.ob_space has type R[301]. (I made some changes to use this implementation in a non-gaming / time-series environment, e.g. added an MlpEncoder to replace the ImpalaEncoder.) This is because my custom gym environment has an observation_space of Box(-inf, inf, (301,), float32) and an action_space of Box(-inf, inf, (30,), float32), which get converted to gym3.types.Real. It seems that ppg's distr_builder allows only Discrete observation and action spaces. Is that so?

RajS999, May 07 '21 12:05