gbuonamico
Hello, thank you for replying. No, because using an LSTM in the actor and value networks requires an additional dimension. The training works fine. This is the portion of code I use...
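For context, here is a minimal sketch of what recurrent actor/value networks look like in TF-Agents. This is not the poster's code: CartPole is a stand-in environment and the `lstm_size` value is an assumption, since the actual snippet is truncated above.

```python
# Minimal sketch, not the poster's code: recurrent actor/value networks in
# TF-Agents. The LSTM is what introduces the extra (time) dimension mentioned.
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import actor_distribution_rnn_network, value_rnn_network

# Stand-in environment; the real one is database-backed and not shown here.
tf_env = tf_py_environment.TFPyEnvironment(suite_gym.load('CartPole-v0'))

actor_net = actor_distribution_rnn_network.ActorDistributionRnnNetwork(
    tf_env.observation_spec(),
    tf_env.action_spec(),
    lstm_size=(64,))  # assumed size

value_net = value_rnn_network.ValueRnnNetwork(
    tf_env.observation_spec(),
    lstm_size=(64,))
```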
Hello, any suggestions will be appreciated...
Not a problem. I think the problem comes from the use of the wrapper `train_step = common.function(train_step)` in the training phase. Keep this in mind for tomorrow and let me know, please...
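For reference, a runnable sketch of the wrapper being discussed; the body of `train_step` here is a stand-in, since the real one (which presumably runs the agent's training) is truncated above.

```python
# Sketch only: common.function is TF-Agents' thin wrapper around tf.function.
# Wrapping train_step makes TF trace it into a graph, which is where hidden
# shape assumptions (e.g. a missing time/batch dimension) tend to surface.
import tensorflow as tf
from tf_agents.utils import common

counter = tf.Variable(0)

def train_step():
    counter.assign_add(1)  # stand-in body; the real one runs agent.train(...)
    return counter

train_step = common.function(train_step)  # the wrapper the thread suspects
print(train_step().numpy())
```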
I agree that time_step and policy_state are not aligned: time_step is **batched**, while policy_state (the initial state) is **not**. If I do not include the additional dimension in the...
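One usual way to line the two up in TF-Agents is to ask the policy for an initial state with an explicit batch size. This is a sketch under the assumption that `tf_env` is a TFPyEnvironment and `agent` an already-built agent; neither appears in the thread.

```python
# Sketch, assuming tf_env (a TFPyEnvironment) and agent already exist:
# get_initial_state(batch_size=...) returns a policy_state whose leading
# dimension matches the batched time_step produced by tf_env.reset().
time_step = tf_env.reset()                                        # batched
policy_state = agent.policy.get_initial_state(tf_env.batch_size)  # batched too

action_step = agent.policy.action(time_step, policy_state)
policy_state = action_step.state  # carry the LSTM state forward
```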
Sorry, but I don't understand what you mean. In my environment, the definitions of action_spec and observation_spec are the following: `self._action_spec = array_spec.BoundedArraySpec(shape=(), dtype=np.int32, minimum=0, maximum=2, name='action')`, `ns = (1, self.shape[0], self.shape[1])`...
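A hedged reconstruction of how such specs typically sit inside a custom environment: only the two quoted spec lines come from the post; the class, the `(960, 18)` shape (echoing the error message below), and the `_reset`/`_step` bodies are assumptions.

```python
# Hedged reconstruction: a minimal PyEnvironment carrying the quoted specs.
# Only the spec lines come from the post; everything else is a stand-in.
import numpy as np
from tf_agents.environments import py_environment
from tf_agents.specs import array_spec
from tf_agents.trajectories import time_step as ts

class MyEnv(py_environment.PyEnvironment):
    def __init__(self, shape=(960, 18)):  # shape echoes the error message below
        super().__init__()
        self.shape = shape
        self._action_spec = array_spec.BoundedArraySpec(
            shape=(), dtype=np.int32, minimum=0, maximum=2, name='action')
        ns = (1, self.shape[0], self.shape[1])
        self._observation_spec = array_spec.ArraySpec(
            shape=ns, dtype=np.float32, name='observation')

    def action_spec(self):
        return self._action_spec

    def observation_spec(self):
        return self._observation_spec

    def _reset(self):
        return ts.restart(np.zeros((1,) + self.shape, dtype=np.float32))

    def _step(self, action):
        return ts.transition(
            np.zeros((1,) + self.shape, dtype=np.float32), reward=0.0)
```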
That's what I did, as you suggested, and that's where I got the error I mentioned in my previous comment: "ValueError: Shapes (960, 18) and (1, 960, 18) are...
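A toy illustration of that incompatibility, with the shapes copied from the error message; everything else here is purely illustrative.

```python
# Toy example: a (960, 18) tensor is incompatible with a (1, 960, 18) shape
# until the leading dimension is added explicitly with expand_dims.
import tensorflow as tf

obs = tf.zeros((960, 18))
expected = tf.TensorShape((1, 960, 18))

try:
    obs.shape.assert_is_compatible_with(expected)
except ValueError as e:
    print(e)  # -> Shapes (960, 18) and (1, 960, 18) are incompatible

obs = tf.expand_dims(obs, axis=0)               # now (1, 960, 18)
obs.shape.assert_is_compatible_with(expected)   # passes
```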
Well, that's not possible, as the environment needs a database and additional procedures to run. But it's a standard Python environment wrapped into a TFEnvironment. No changes are made...
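For completeness, the standard wrapping looks like this; `MyEnv` is the hypothetical environment sketched earlier, standing in for the real database-backed one.

```python
# Sketch of the standard wrapping; MyEnv is the hypothetical environment above,
# standing in for the real database-backed environment.
from tf_agents.environments import tf_py_environment

tf_env = tf_py_environment.TFPyEnvironment(MyEnv())
print(tf_env.batch_size)  # TFPyEnvironment exposes a batch dimension (here 1)
```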
Thank you for your answer. In the end I'm using this workaround (in bold in the code; not sure it's great, but it seems to work): `t_step = tf_environment.reset()`, `t_step = tf.expand_dims(t_step.observation, axis=0)`...
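A hedged completion of that workaround: only the two quoted lines come from the post, and `MyEnv` from the earlier sketch again stands in for the real environment.

```python
# Only the two quoted lines come from the post; MyEnv is the stand-in above.
import tensorflow as tf
from tf_agents.environments import tf_py_environment

tf_environment = tf_py_environment.TFPyEnvironment(MyEnv())

t_step = tf_environment.reset()
t_step = tf.expand_dims(t_step.observation, axis=0)  # manually add the extra dim
print(t_step.shape)  # observation now carries one extra leading dimension
```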
But I remain frustrated at not really understanding what the root problem is... Thank you for your patience.