gbuonamico
Hello, thank you for replying. No, because using an LSTM in the actor and value networks requires an additional dimension. The training works fine. This is the portion of code I use...
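For context, here is a minimal sketch of what recurrent actor/value networks look like in TF-Agents. This is not the poster's code: CartPole is a stand-in environment and the `lstm_size` value is an assumption, since the actual snippet is truncated above.

```python
# Minimal sketch, not the poster's code: recurrent actor/value networks in
# TF-Agents. The LSTM is what introduces the extra (time) dimension mentioned.
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import actor_distribution_rnn_network, value_rnn_network

# Stand-in environment; the real one is database-backed and not shown here.
tf_env = tf_py_environment.TFPyEnvironment(suite_gym.load('CartPole-v0'))

actor_net = actor_distribution_rnn_network.ActorDistributionRnnNetwork(
    tf_env.observation_spec(),
    tf_env.action_spec(),
    lstm_size=(64,))  # assumed size

value_net = value_rnn_network.ValueRnnNetwork(
    tf_env.observation_spec(),
    lstm_size=(64,))
```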
Hello, any suggestions will be appreciated...
Not a problem. I think the problem comes from the use of the wrapper `train_step = common.function(train_step)` in the training phase. Keep this in mind for tomorrow and let me know, please...
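For reference, a runnable sketch of the wrapper being discussed; the body of `train_step` here is a stand-in, since the real one (which presumably runs the agent's training) is truncated above.

```python
# Sketch only: common.function is TF-Agents' thin wrapper around tf.function.
# Wrapping train_step makes TF trace it into a graph, which is where hidden
# shape assumptions (e.g. a missing time/batch dimension) tend to surface.
import tensorflow as tf
from tf_agents.utils import common

counter = tf.Variable(0)

def train_step():
    counter.assign_add(1)  # stand-in body; the real one runs agent.train(...)
    return counter

train_step = common.function(train_step)  # the wrapper the thread suspects
print(train_step().numpy())
```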
I agree that time_step and policy_state are not aligned: time_step is **batched**, while policy_state (the initial state) is **not**. If I do not include the additional dimension in the...
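One usual way to line the two up in TF-Agents is to ask the policy for an initial state with an explicit batch size. This is a sketch under the assumption that `tf_env` is a TFPyEnvironment and `agent` an already-built agent; neither appears in the thread.

```python
# Sketch, assuming tf_env (a TFPyEnvironment) and agent already exist:
# get_initial_state(batch_size=...) returns a policy_state whose leading
# dimension matches the batched time_step produced by tf_env.reset().
time_step = tf_env.reset()                                        # batched
policy_state = agent.policy.get_initial_state(tf_env.batch_size)  # batched too

action_step = agent.policy.action(time_step, policy_state)
policy_state = action_step.state  # carry the LSTM state forward
```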
Sorry, but I don't understand what you mean. In my environment, the definitions of action_spec and observation_spec are the following: `self._action_spec = array_spec.BoundedArraySpec(shape=(), dtype=np.int32, minimum=0, maximum=2, name='action')`, `ns = (1, self.shape[0], self.shape[1])`...
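A hedged reconstruction of how such specs typically sit inside a custom environment: only the two quoted spec lines come from the post; the class, the `(960, 18)` shape (echoing the error message below), and the `_reset`/`_step` bodies are assumptions.

```python
# Hedged reconstruction: a minimal PyEnvironment carrying the quoted specs.
# Only the spec lines come from the post; everything else is a stand-in.
import numpy as np
from tf_agents.environments import py_environment
from tf_agents.specs import array_spec
from tf_agents.trajectories import time_step as ts

class MyEnv(py_environment.PyEnvironment):
    def __init__(self, shape=(960, 18)):  # shape echoes the error message below
        super().__init__()
        self.shape = shape
        self._action_spec = array_spec.BoundedArraySpec(
            shape=(), dtype=np.int32, minimum=0, maximum=2, name='action')
        ns = (1, self.shape[0], self.shape[1])
        self._observation_spec = array_spec.ArraySpec(
            shape=ns, dtype=np.float32, name='observation')

    def action_spec(self):
        return self._action_spec

    def observation_spec(self):
        return self._observation_spec

    def _reset(self):
        return ts.restart(np.zeros((1,) + self.shape, dtype=np.float32))

    def _step(self, action):
        return ts.transition(
            np.zeros((1,) + self.shape, dtype=np.float32), reward=0.0)
```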
That's what I did, as you suggested, and that's where I got the error I mentioned in my previous comment: "ValueError: Shapes (960, 18) and (1, 960, 18) are...
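A toy illustration of that incompatibility, with the shapes copied from the error message; everything else here is purely illustrative.

```python
# Toy example: a (960, 18) tensor is incompatible with a (1, 960, 18) shape
# until the leading dimension is added explicitly with expand_dims.
import tensorflow as tf

obs = tf.zeros((960, 18))
expected = tf.TensorShape((1, 960, 18))

try:
    obs.shape.assert_is_compatible_with(expected)
except ValueError as e:
    print(e)  # -> Shapes (960, 18) and (1, 960, 18) are incompatible

obs = tf.expand_dims(obs, axis=0)               # now (1, 960, 18)
obs.shape.assert_is_compatible_with(expected)   # passes
```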
Well, that's not possible, as the environment needs a database and additional procedures to run. But it's a standard Python environment wrapped into a TFEnvironment. No changes are made...
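For completeness, the standard wrapping looks like this; `MyEnv` is the hypothetical environment sketched earlier, standing in for the real database-backed one.

```python
# Sketch of the standard wrapping; MyEnv is the hypothetical environment above,
# standing in for the real database-backed environment.
from tf_agents.environments import tf_py_environment

tf_env = tf_py_environment.TFPyEnvironment(MyEnv())
print(tf_env.batch_size)  # TFPyEnvironment exposes a batch dimension (here 1)
```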
Thank you for your answer. In the end I'm using this workaround (in bold in the code; not sure it's great, but it seems to work): `t_step = tf_environment.reset()`, `t_step = tf.expand_dims(t_step.observation, axis=0)`...
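A hedged completion of that workaround: only the two quoted lines come from the post, and `MyEnv` from the earlier sketch again stands in for the real environment.

```python
# Only the two quoted lines come from the post; MyEnv is the stand-in above.
import tensorflow as tf
from tf_agents.environments import tf_py_environment

tf_environment = tf_py_environment.TFPyEnvironment(MyEnv())

t_step = tf_environment.reset()
t_step = tf.expand_dims(t_step.observation, axis=0)  # manually add the extra dim
print(t_step.shape)  # observation now carries one extra leading dimension
```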
But I remain frustrated at not really understanding what the root problem is... Thank you for your patience.