Summer Yue
Hmm, could you print out `env.action_spec()` for your environment?
Could you please elaborate? I see that the `l2_regularization_loss` includes the regularization losses and was added to total_loss. The coefficients default to 0, but if you wish to use...
Oh I see. Thanks for explaining. So you are pointing out the inconsistency in the implementation between PPO and other agents in terms of where losses are calculated, not that...
Sorry for the delayed response. We are not sure why this inconsistency exists; it's probably due to historical reasons. We could look into this once we get some bandwidth. Also feel...
Thank you for reporting. It's a little hard to know exactly what's going on. Could you help print out both `action_output_spec` and `action_spec` so we know why it doesn't match?
Thanks for providing the additional information! I think you're right. I was able to reproduce your issue in a simple example in Colab. I'll follow up here with a more...
Underneath you're using tf.train.CheckpointManager.save(). Your question is how to inspect a checkpointed file from TensorFlow. There have been some discussions about potentially using the `inspect_checkpoint.py` tool http://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/inspect_checkpoint.py (full disclosure I haven't...
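(Not a definitive recipe, just a minimal sketch: another way to peek inside a checkpoint is `tf.train.load_checkpoint`, which returns a reader you can query for variable names and shapes. The variable `my_var` and the temp directory here are made up for illustration.)

```python
import tempfile

import tensorflow as tf

# Save a tiny checkpoint so the example is self-contained.
ckpt_dir = tempfile.mkdtemp()
v = tf.Variable([1.0, 2.0], name="my_var")  # hypothetical variable
path = tf.train.Checkpoint(v=v).save(ckpt_dir + "/ckpt")

# Inspect it: list every saved variable with its shape.
reader = tf.train.load_checkpoint(path)
shape_map = reader.get_variable_to_shape_map()
for name, shape in shape_map.items():
    print(name, shape)
```

`reader.get_tensor(name)` can then pull out an individual tensor's value if you need more than shapes.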
Should your observation spec in your environment be TensorShape([960, 18]) instead of TensorShape([1, 960, 18])?
Sorry about the delay. Let me take a closer look at this later today.
The ValueError you're seeing indicates that the `time_step` you're passing into `policy.action` doesn't match the spec it expects. Could you try not including the additional dimension in...
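(For illustration only; the `[960, 18]` shape is taken from the spec mentioned earlier in this thread, and the arrays here are stand-ins for the real observation. The idea is simply that dropping the extra leading dimension makes the observation line up with the spec.)

```python
import numpy as np

spec_shape = (960, 18)  # shape the policy's time_step_spec expects (assumed)
obs = np.zeros((1, 960, 18), dtype=np.float32)  # observation carrying an extra leading dim

# Drop the extra dimension so the observation matches the spec.
obs_fixed = np.squeeze(obs, axis=0)
print(obs_fixed.shape)  # → (960, 18)
```

If the environment itself produces the extra dimension, the cleaner fix is to declare the observation spec without it, as suggested above.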