Sergio Guadarrama

Results 68 comments of Sergio Guadarrama

You also need to use the `PPOPolicy`, which will collect the proper extra information. Take a look at https://github.com/tensorflow/agents/blob/master/tf_agents/examples/ppo/schulman17/train_eval_lib.py

Try ``` pip install --force-reinstall tf-agents[reverb] ```

If you don't pass the `observation_and_action_constraint_splitter` to DQN, then the observation contains both the observation and the mask. And the mask is a bool, which cannot be sampled. Not sure what...
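As a sketch of what such a splitter looks like: it takes the packed observation and returns the network input and the action mask separately. The dict keys `'observation'` and `'valid_actions'` below are assumptions; use whatever structure your environment emits.

```python
# Sketch of an observation/action-mask splitter for DqnAgent, assuming the
# environment packs the mask into a dict observation under the hypothetical
# keys 'observation' and 'valid_actions'.
def observation_and_action_constraint_splitter(obs):
    # Return (network_input, action_mask): the Q-network sees only the raw
    # observation, while the mask restricts which actions DQN may select.
    return obs['observation'], obs['valid_actions']

# Plain dict standing in for a time step's observation:
packed = {'observation': [0.1, 0.2, 0.3], 'valid_actions': [1, 0, 1]}
net_input, mask = observation_and_action_constraint_splitter(packed)
```

You would then pass this function as the `observation_and_action_constraint_splitter` argument when constructing the `DqnAgent`, so only the raw observation reaches the network and the bool mask never has to be sampled.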

Also, can you make sure you have the latest version, since `sample_spec_nest` can now handle tf.bool: https://github.com/tensorflow/agents/blob/master/tf_agents/specs/tensor_spec.py#L380

You should be able to use the replay buffers or any other utils from TF-Agents, although you probably need to clone the repository and use it directly, instead of installing it with `pip install`.

If you are interested in creating a PR which adds an install option that avoids installing mujoco, e.g. `pip3 install tf-agents[nomujoco]`, we can review it.
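One wrinkle such a PR would have to address: pip extras can only *add* packages, not remove them, so the usual pattern is to move mujoco out of the core requirements and into its own extra. A hypothetical `setup.py` sketch (package names and the split below are illustrative, not the actual TF-Agents configuration):

```python
# Illustrative sketch only: since extras are additive, make mujoco an extra
# rather than a core dependency, so a plain install skips it.
REQUIRED = ['absl-py', 'gin-config', 'numpy']  # core deps (illustrative)

EXTRAS = {
    'reverb': ['dm-reverb'],   # optional replay-buffer backend
    'mujoco': ['mujoco'],      # opt-in: `pip install tf-agents[mujoco]`
}

# setup(
#     name='tf-agents',
#     install_requires=REQUIRED,
#     extras_require=EXTRAS,
#     ...
# )
```

Under this layout, `pip install tf-agents` already avoids mujoco, and users who need it opt in with the `[mujoco]` extra.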

As you can see in the error message, the shape of the `observation` has changed from `[5, 5, 3]` to `[3, 5, 2]`, so I suppose there is an error...
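A quick way to catch this kind of drift early is to compare each observation against the declared spec shape before it ever reaches the agent. A minimal pure-Python sketch (TF-Agents also ships environment-validation utilities that do this more thoroughly):

```python
# Fail fast when an environment returns an observation whose shape drifts
# from the declared spec. Pure-Python stand-in using nested lists.
def shape_of(x):
    # Compute the nested-list shape, assuming rectangular data.
    if isinstance(x, list):
        return (len(x),) + shape_of(x[0])
    return ()

spec_shape = (5, 5, 3)           # shape declared in the observation spec
bad_obs = [[[0.0] * 2] * 5] * 3  # a misbehaving observation of shape (3, 5, 2)

mismatch = shape_of(bad_obs) != spec_shape
```

Running such a check on every reset/step during debugging pinpoints exactly which transition first produces the wrong shape.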

Can you try updating using `pip install --upgrade tf_agents[reverb]`?

Probably due to a mismatched version of TF.

SAC is not designed for discrete actions; maybe it would be better to try PPO.