
Methods for individual performance measures

Open · afbarnard opened this issue on Mar 29, 2013 · 1 comment

All the common performance measures should be available as methods, as long as they can be calculated from a confusion matrix and/or a ranking. They may apply at a given threshold (e.g. accuracy) or across thresholds (e.g. average precision). Provide a method for each one, but call out in the documentation which ones are aliases of others.

Common performance measures to be included:

  • accuracy
  • precision
  • recall
  • sensitivity
  • specificity
  • true positive rate
  • false positive rate
  • positive predictive value
  • negative predictive value
  • average precision
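
For reference, here is a minimal sketch of how the per-threshold measures above reduce to confusion matrix counts. The class and method names are illustrative only, not Roc's actual API.

```java
// Illustrative only: per-threshold measures defined in terms of the four
// confusion matrix counts at a single threshold. Not Roc's API.
public final class ConfusionMatrixMeasures {
    public final long tp, fp, tn, fn;  // counts at one threshold

    public ConfusionMatrixMeasures(long tp, long fp, long tn, long fn) {
        this.tp = tp; this.fp = fp; this.tn = tn; this.fn = fn;
    }

    public double accuracy()    { return (double) (tp + tn) / (tp + fp + tn + fn); }
    public double precision()   { return (double) tp / (tp + fp); }  // a.k.a. positive predictive value
    public double recall()      { return (double) tp / (tp + fn); }  // a.k.a. sensitivity, true positive rate
    public double specificity() { return (double) tn / (tn + fp); }  // a.k.a. true negative rate
    public double falsePositiveRate()       { return (double) fp / (fp + tn); }  // 1 - specificity
    public double negativePredictiveValue() { return (double) tn / (tn + fn); }
}
```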

Would there be a way to make a general aggregate/average function that could be combined with any of the per-threshold methods? For example, the precision method plus an averaging aggregator would yield average precision.
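
A hypothetical sketch of what such a general aggregator could look like, reusing the illustrative `ConfusionMatrixMeasures` class from the sketch above; none of these names are part of Roc:

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of the proposed general aggregator: average any
// per-threshold measure over a list of confusion matrices (one per threshold).
public final class Aggregate {
    public static double average(List<ConfusionMatrixMeasures> perThreshold,
                                 ToDoubleFunction<ConfusionMatrixMeasures> measure) {
        return perThreshold.stream().mapToDouble(measure).average().orElse(Double.NaN);
    }
}

// Usage (illustrative): Aggregate.average(matrices, ConfusionMatrixMeasures::precision)
// corresponds to average precision only if `matrices` holds exactly the right
// set of thresholds.
```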

afbarnard · Mar 29, 2013

What are the use cases for the general aggregation besides average precision?

My first thought is that the subtleties of the different scores would make a general aggregator more dangerous than useful. For example, average precision should only use the thresholds that correspond to a positive example (with added complications if there are ties in the ranking). But I imagine other aggregations would have other selection criteria. My guess is that to calculate ROC area via an aggregator like this, you would need the true positive rate at each threshold corresponding to a negative example.
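
To make the threshold-selection subtlety concrete, here is a minimal sketch (again not Roc's API) of average precision computed directly from a ranked list of labels: it evaluates precision only at the ranks of the positive examples, and ignores ties for simplicity.

```java
import java.util.List;

// Minimal, illustrative average precision over a ranked list of labels,
// best-scored first (true = positive example). Precision is accumulated
// only at the ranks where a positive example appears.
public final class AveragePrecision {
    public static double of(List<Boolean> rankedLabels) {
        double sumPrecision = 0.0;
        int positivesSeen = 0;
        for (int rank = 1; rank <= rankedLabels.size(); rank++) {
            if (rankedLabels.get(rank - 1)) {
                positivesSeen++;
                sumPrecision += (double) positivesSeen / rank;  // precision at this rank
            }
        }
        return positivesSeen == 0 ? Double.NaN : sumPrecision / positivesSeen;
    }
}
```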

Since most of the threshold scores mentioned are not used in any aggregate scores I'm familiar with, I think it's more understandable and usable to have specific functions for average precision and whatever other aggregate scores turn out to be useful.

kboyd · Apr 2, 2013