ValueError: Exception encountered when calling layer "QNetwork" (Issues with Q-Networks)
Hi, I'm training with Q-networks. This is my first time using them, so I am not sure why I am getting an error. Here's my code so far:
# the code for getting the environment, as well as imports lines have been omitted
environment = wrappers.RunStats(environment)
tf_env = tf_py_environment.TFPyEnvironment(environment)
time_step = tf_env.reset()
input_tensor_spec = tensor_spec.TensorSpec((1,3), tf.float64)
time_step_spec = ts.time_step_spec(input_tensor_spec)
action_spec = tensor_spec.BoundedTensorSpec((),
                                            tf.int32,
                                            minimum=0,
                                            maximum=2)
num_of_actions = action_spec.maximum - action_spec.minimum + 1
batch_size = 1
observation = tf.ones((1,3), dtype=tf.float64)
time_steps = ts.restart(observation, batch_size=batch_size)
my_q_network = q_network.QNetwork(
    input_tensor_spec=input_tensor_spec,
    action_spec=action_spec
)
my_q_policy = q_policy.QPolicy(
    time_step_spec, action_spec, q_network=my_q_network)
action_step = my_q_policy.action(time_steps)
But I somehow get an error in the last line above. Here is the error message:
ValueError: Exception encountered when calling layer "QNetwork" " f"(type QNetwork).
Input 0 of layer "dense_2" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (40,)
Call arguments received by layer "QNetwork" " f"(type QNetwork):
• observation=tf.Tensor(shape=(2, 3), dtype=float64)
• step_type=tf.Tensor(shape=(2,), dtype=int32)
• network_state=()
• training=False
I believe there may be something wrong with the time_steps. Here's a printout of it:
TimeStep(
{'discount': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([1.], dtype=float32)>,
'observation': <tf.Tensor: shape=(1, 3), dtype=float64, numpy=array([[1., 1., 1.]])>,
'reward': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.], dtype=float32)>,
'step_type': <tf.Tensor: shape=(1,), dtype=int32, numpy=array([0], dtype=int32)>})
Clearly, the observation/input is 2-dimensional, but it is still complaining that it found only 1 dimension. Thanks in advance.
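You might be missing the batch dimension in your input. observation = tf.ones((1,3), dtype=tf.float64) has exactly the shape declared in input_tensor_spec, (1,3), so it has no leading batch axis of its own. Try adding a batch dimension to this input.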