F1Score throws ValueError
System information
- OS Platform and Distribution: Linux Ubuntu 18.04
- TensorFlow version and how it was installed (source or binary): 2.3.1, from pip
- TensorFlow-Addons version and how it was installed (source or binary): 0.11.2, from pip
- Python version: 3.8.2
- Is GPU used? (yes/no): yes
Describe the bug
When adding F1Score to an FCNN model with image input and output, the code fails during initialization with ValueError: shapes must be equal rank, but are 1 and 3 for '{{node AssignAddVariableOp_4}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_4/resource, Sum_3)' with input shapes: [], [?,?,4].
I checked, and the two tensors y_true and y_pred have the shapes [None, None, None, None] and [None, None, None, 4].
Every other metric handles this just fine (e.g. the native accuracy metric in Keras, or another custom metric that I'm using), which leads me to believe it is a bug in FBetaScore.
I included an MWE with random data below, and the same error arises there. EDIT: I also tried running the model in eager mode, which gave a slightly different error; that traceback is posted at the bottom.
Code to reproduce the issue
### filename: toymodel.py ###
import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras import layers, losses, optimizers
i = layers.Input(shape=(None,None,3), name="input")
n = layers.Conv2D(16, kernel_size=3, padding='same', activation="relu", name="layer1")(i)
n = layers.Conv2D(32, kernel_size=3, padding='same', activation="relu", name="layer2")(n)
o = layers.Conv2D(4, kernel_size=1, padding='same', activation="softmax", name="layer3")(n)
model = tf.keras.models.Model(inputs=i, outputs=o)
model.compile(
    loss=losses.CategoricalCrossentropy(),
    optimizer=optimizers.RMSprop(),
    metrics=[tfa.metrics.F1Score(num_classes=4)],
)
model.summary()
np.random.seed(42)
# 16 random 10x10 RGB inputs
x = tf.constant(np.random.rand(16, 10, 10, 3), shape=(16, 10, 10, 3))
# random one-hot targets over 4 classes per pixel
y = tf.constant(np.eye(4)[np.random.choice(4, (16, 10, 10))], shape=(16, 10, 10, 4))
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(2)
model.fit(ds, batch_size=2, epochs=4, verbose=2, steps_per_epoch=8)
Other info / logs
Traceback (most recent call last):
File "/[...]/pyplayground/toymodel.py", line 24, in <module>
model.fit(ds, batch_size=2, epochs=4, verbose=2, steps_per_epoch=8)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/[...]/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "/[...]/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 823, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "/[...]/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 696, in _initialize
self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
File "/[...]/python3.8/site-packages/tensorflow/python/eager/function.py", line 2855, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/eager/function.py", line 3213, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/eager/function.py", line 3065, in _create_graph_function
func_graph_module.func_graph_from_py_func(
File "/[...]/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 986, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 600, in wrapped_fn
return weak_wrapped_fn().__wrapped__(*args, **kwds)
File "/[...]/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 973, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py:806 train_function *
return step_function(self, iterator)
/[...]/python3.8/site-packages/tensorflow_addons/metrics/f_scores.py:142 update_state *
self.true_positives.assign_add(_weighted_sum(y_pred * y_true, sample_weight))
/[...]/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py:823 assign_add **
assign_add_op = gen_resource_variable_ops.assign_add_variable_op(
/[...]/python3.8/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py:56 assign_add_variable_op
_, _, _op, _outputs = _op_def_library._apply_op_helper(
/[...]/python3.8/site-packages/tensorflow/python/framework/op_def_library.py:742 _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
/[...]/python3.8/site-packages/tensorflow/python/framework/func_graph.py:591 _create_op_internal
return super(FuncGraph, self)._create_op_internal( # pylint: disable=protected-access
/[...]/python3.8/site-packages/tensorflow/python/framework/ops.py:3477 _create_op_internal
ret = Operation(
/[...]/python3.8/site-packages/tensorflow/python/framework/ops.py:1974 __init__
self._c_op = _create_c_op(self._graph, node_def, inputs,
/[...]/python3.8/site-packages/tensorflow/python/framework/ops.py:1815 _create_c_op
raise ValueError(str(e))
ValueError: Shapes must be equal rank, but are 1 and 3 for '{{node AssignAddVariableOp_2}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_2/resource, Sum_2)' with input shapes: [], [10,10,4].
Traceback from eager mode
Traceback (most recent call last):
File "/[...]/pyplayground/toymodel.py", line 30, in <module>
model.fit(ds, batch_size=2, epochs=4, verbose=2, steps_per_epoch=8)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 806, in train_function
return step_function(self, iterator)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 796, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/[...]/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1211, in run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2585, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2945, in _call_for_each_replica
return fn(*args, **kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 275, in wrapper
return func(*args, **kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 789, in run_step
outputs = model.train_step(data)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 759, in train_step
self.compiled_metrics.update_state(y, y_pred, sample_weight)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/engine/compile_utils.py", line 409, in update_state
metric_obj.update_state(y_t, y_p, sample_weight=mask)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/utils/metrics_utils.py", line 90, in decorated
update_op = update_state_fn(*args, **kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/keras/metrics.py", line 176, in update_state_fn
return ag_update_state(*args, **kwargs)
File "/[...]/python3.8/site-packages/tensorflow/python/autograph/impl/api.py", line 258, in wrapper
raise e.ag_error_metadata.to_exception(e)
tensorflow.python.framework.errors_impl.InvalidArgumentError: in user code:
/[...]/pyplayground/f_scores.py:145 update_state *
self.true_positives.assign_add(_weighted_sum(y_pred * y_true, sample_weight))
/[...]/python3.8/site-packages/tensorflow/python/ops/resource_variable_ops.py:823 assign_add **
assign_add_op = gen_resource_variable_ops.assign_add_variable_op(
/[...]/python3.8/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py:47 assign_add_variable_op
_ops.raise_from_not_ok_status(e, name)
/[...]/python3.8/site-packages/tensorflow/python/framework/ops.py:6843 raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
<string>:3 raise_from
InvalidArgumentError: Cannot update variable with shape [4] using a Tensor with shape [10,10,4], shapes must be equal. [Op:AssignAddVariableOp]
Hi @bjtho08, thanks for reporting the issue. It seems that we only support 2D y_true and y_pred shaped like [batch_size, num_classes].
@WindQAQ do you know what kind of F1 impl we have in TF/Model?
> @WindQAQ do you know what kind of F1 impl we have in TF/Model?
No, but I remember that they asked us if we could migrate the F1 score a very long time ago. Not sure if there was any follow-up.
@WindQAQ yes, I realized that over the weekend as I was working on isolating the issue. I made a local copy where I reshape y_true and y_pred using tf.reshape(tf.cast(y_true, self.dtype), [-1, self.num_classes]) and similarly for y_pred. I was wondering, do you know if it is possible to output several F1 scores to the Keras training loop? Currently, if I bypass the tf.reduce_mean(f1_score), the end result is still just a simple mean of the (in my case) 4 numbers returned from the F1Score.result() method. I want to be able to see the F1 score for each class, so that I can monitor all classes and make sure that it isn't just one or two of them performing well enough to outweigh the poor performance of the remaining classes.
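For reference, a minimal sketch of that reshape workaround, written as a subclass rather than a patched local copy of f_scores.py (the class name PixelwiseF1Score is hypothetical, and this assumes flattening the spatial dimensions is acceptable; per-pixel sample_weight handling is not addressed):
### filename: pixelwise_f1.py (sketch) ###
import tensorflow as tf
import tensorflow_addons as tfa

class PixelwiseF1Score(tfa.metrics.F1Score):
    # Flattens [batch, H, W, num_classes] targets and predictions to
    # [batch * H * W, num_classes] so the 2D-only update logic applies.
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.reshape(tf.cast(y_true, self.dtype), [-1, self.num_classes])
        y_pred = tf.reshape(tf.cast(y_pred, self.dtype), [-1, self.num_classes])
        # note: sample_weight, if given, would also need flattening
        return super().update_state(y_true, y_pred, sample_weight)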
> @WindQAQ do you know what kind of F1 impl we have in TF/Model?
>
> No, but I remember that they asked us if we could migrate the F1 score a very long time ago. Not sure if there was any follow-up.
Yes, at https://github.com/tensorflow/tensorflow/pull/31818
We also have some F1 implementations in official/NLP: https://github.com/tensorflow/models/search?q=f1
Also, we had an old thread about multi-class precision/recall at https://github.com/tensorflow/addons/issues/1753
> I made a local copy where I reshape y_true and y_pred using tf.reshape(tf.cast(y_true, self.dtype), [-1, self.num_classes]) and similarly for y_pred. I was wondering, do you know if it is possible to output several F1 scores to the Keras training loop?
I'm afraid that it's impossible to do that if you use model.compile() and model.fit(). It seems that something in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/engine/training.py reduces the metric outputs. One way I can think of is to write a custom callback and print {m.name: m.result() for m in self.metrics}.
https://colab.research.google.com/drive/1G2HQ95iE2lqYub8i3O-kl5_mXMrIwmxg?usp=sharing
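For illustration, a minimal sketch of such a callback, adapting the suggestion above by accessing the metrics via self.model (the class name and the choice to report at epoch end are assumptions; in TF 2.3, self.model.metrics also includes the scalar loss tracker):
### filename: per_class_logger.py (sketch) ###
import tensorflow as tf

class PerClassMetricLogger(tf.keras.callbacks.Callback):
    # Prints the unreduced result of every metric, so a vector-valued
    # F1Score shows one number per class instead of their mean.
    def on_epoch_end(self, epoch, logs=None):
        results = {m.name: m.result().numpy() for m in self.model.metrics}
        print(f"epoch {epoch}: {results}")

Usage: model.fit(ds, epochs=4, callbacks=[PerClassMetricLogger()])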
> One way I can think of is to write a custom callback and print {m.name: m.result() for m in self.metrics}.
>
> https://colab.research.google.com/drive/1G2HQ95iE2lqYub8i3O-kl5_mXMrIwmxg?usp=sharing
@WindQAQ I realize this is well beyond the original issue, but how would you go about using that particular solution for batch-wise updates, like the Keras progress bar, without losing the progress bar, of course?