
support multilabel confusion matrix

Open 0ssamaak0 opened this issue 2 years ago • 7 comments

Using sklearn's multilabel_confusion_matrix.

sample usage:

import evaluate
import numpy as np

confusion_metric = evaluate.load("confusion_matrix", config_name="multilabel")

y_true = np.array([[0, 0, 0, 0, 1], [1, 0, 1, 0, 0], [0, 0, 1, 0, 1], [1, 0, 0, 0, 0]])
y_pred = np.array([[0, 1, 0, 0, 1], [0, 0, 1, 0, 0], [0, 0, 1, 1, 1], [1, 0, 1, 0, 1]])

confusion_metric.compute(references=y_true, predictions=y_pred)

output:

{'confusion_matrix': array([[[2, 0],
         [1, 1]],
 
        [[3, 1],
         [0, 0]],
 
        [[1, 1],
         [0, 2]],
 
        [[3, 1],
         [0, 0]],
 
        [[1, 1],
         [0, 2]]])}
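For reference, a minimal sketch of the underlying sklearn call this config would wrap (assuming the wrapper simply forwards to it); each per-label 2x2 block follows sklearn's [[TN, FP], [FN, TP]] layout:

from sklearn.metrics import multilabel_confusion_matrix
import numpy as np

y_true = np.array([[0, 0, 0, 0, 1], [1, 0, 1, 0, 0], [0, 0, 1, 0, 1], [1, 0, 0, 0, 0]])
y_pred = np.array([[0, 1, 0, 0, 1], [0, 0, 1, 0, 0], [0, 0, 1, 1, 1], [1, 0, 1, 0, 1]])

# One 2x2 matrix per label, laid out as [[TN, FP], [FN, TP]]
mcm = multilabel_confusion_matrix(y_true, y_pred)
print(mcm.shape)  # (5, 2, 2): one block per label
print(mcm[0])     # label 0: [[2, 0], [1, 1]] -> 2 TN, 0 FP, 1 FN, 1 TP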

0ssamaak0 avatar Jan 05 '24 14:01 0ssamaak0

@lvwerra

0ssamaak0 avatar Jan 06 '24 15:01 0ssamaak0

There was a problem with the file formatting; I reformatted it and committed the new code.

0ssamaak0 avatar Jan 11 '24 12:01 0ssamaak0

There's an error in the tests that I can't understand; it seems to be unrelated to the code 😅 sorry for the inconvenience.

0ssamaak0 avatar Jan 11 '24 14:01 0ssamaak0

Overall LGTM, but I wonder if it should be a separate metric to keep a 1:1 mapping to sklearn

I'm not 100% sure what the optimal way is, but I think the main idea behind evaluate is to be an abstraction.

Anyway, if you think it's better to have it in a separate module, I can do it.
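To make the two options concrete, a rough sketch of the API shapes being discussed; the separate module name in the commented-out line is hypothetical and does not exist in this PR:

import evaluate

# Option in this PR: reuse the existing module, switch behaviour via config_name
cm_multilabel = evaluate.load("confusion_matrix", config_name="multilabel")

# Alternative (hypothetical): a separate module mirroring sklearn's function name 1:1
# cm_multilabel = evaluate.load("multilabel_confusion_matrix")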

thank you ☺️

0ssamaak0 avatar Jan 11 '24 14:01 0ssamaak0

@osanseviero will you accept the request? I've made the suggested edits.

0ssamaak0 avatar Jan 13 '24 18:01 0ssamaak0

Thanks for rerunning the workflow, but I don't understand what the problem is 😬

0ssamaak0 avatar Jan 13 '24 19:01 0ssamaak0

I fixed the problem in the docstring example that was causing the unit test failure. Can you run the tests and merge, please? @lvwerra thank you

0ssamaak0 avatar Mar 18 '24 15:03 0ssamaak0