Sergio Guadarrama

Results 68 comments of Sergio Guadarrama

You also need to use the `PPOPolicy`, which will collect the proper extra information. Take a look at https://github.com/tensorflow/agents/blob/master/tf_agents/examples/ppo/schulman17/train_eval_lib.py

Try ``` pip install --force-reinstall tf-agents[reverb] ```

If you don't pass the `observation_and_action_constraint_splitter` to DQN, then the observation contains both the observation and the mask. And the mask is a bool, which cannot be sampled. Not sure what...
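As a sketch of what such a splitter looks like: it takes the packed observation and returns the network input and the action mask separately. The dict keys `'observation'` and `'valid_actions'` below are assumptions; use whatever structure your environment emits.

```python
# Sketch of an observation/action-mask splitter for DqnAgent, assuming the
# environment packs the mask into a dict observation under the hypothetical
# keys 'observation' and 'valid_actions'.
def observation_and_action_constraint_splitter(obs):
    # Return (network_input, action_mask): the Q-network sees only the raw
    # observation, while the mask restricts which actions DQN may select.
    return obs['observation'], obs['valid_actions']

# Plain dict standing in for a time step's observation:
packed = {'observation': [0.1, 0.2, 0.3], 'valid_actions': [1, 0, 1]}
net_input, mask = observation_and_action_constraint_splitter(packed)
```

You would then pass this function as the `observation_and_action_constraint_splitter` argument when constructing the `DqnAgent`, so only the raw observation reaches the network and the bool mask never has to be sampled.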

Also, can you make sure you have the latest version, since `sample_spec_nest` can now handle tf.bool: https://github.com/tensorflow/agents/blob/master/tf_agents/specs/tensor_spec.py#L380

You should be able to use the replay buffers or any other utils from TF-Agents, although you probably need to clone the repository and use it directly, instead of installing it with `pip install`.

If you are interested in creating a PR which adds an install option that avoids installing mujoco, e.g. `pip3 install tf-agents[nomujoco]`, we can review it.
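One wrinkle such a PR would have to address: pip extras can only *add* packages, not remove them, so the usual pattern is to move mujoco out of the core requirements and into its own extra. A hypothetical `setup.py` sketch (package names and the split below are illustrative, not the actual TF-Agents configuration):

```python
# Illustrative sketch only: since extras are additive, make mujoco an extra
# rather than a core dependency, so a plain install skips it.
REQUIRED = ['absl-py', 'gin-config', 'numpy']  # core deps (illustrative)

EXTRAS = {
    'reverb': ['dm-reverb'],   # optional replay-buffer backend
    'mujoco': ['mujoco'],      # opt-in: `pip install tf-agents[mujoco]`
}

# setup(
#     name='tf-agents',
#     install_requires=REQUIRED,
#     extras_require=EXTRAS,
#     ...
# )
```

Under this layout, `pip install tf-agents` already avoids mujoco, and users who need it opt in with the `[mujoco]` extra.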

As you can see in the error message, the shape of the `observation` has changed from `[5, 5, 3]` to `[3, 5, 2]`, so I suppose there is an error...
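A quick way to catch this kind of drift early is to compare each observation against the declared spec shape before it ever reaches the agent. A minimal pure-Python sketch (TF-Agents also ships environment-validation utilities that do this more thoroughly):

```python
# Fail fast when an environment returns an observation whose shape drifts
# from the declared spec. Pure-Python stand-in using nested lists.
def shape_of(x):
    # Compute the nested-list shape, assuming rectangular data.
    if isinstance(x, list):
        return (len(x),) + shape_of(x[0])
    return ()

spec_shape = (5, 5, 3)           # shape declared in the observation spec
bad_obs = [[[0.0] * 2] * 5] * 3  # a misbehaving observation of shape (3, 5, 2)

mismatch = shape_of(bad_obs) != spec_shape
```

Running such a check on every reset/step during debugging pinpoints exactly which transition first produces the wrong shape.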

Can you try updating using `pip install --upgrade tf_agents[reverb]`?

Probably due to a mismatched version of TF.

SAC is not designed for discrete actions; maybe it would be better to try PPO.