
Methods for individual performance measures

Open · afbarnard opened this issue on Mar 29, 2013 · 1 comment

All the common performance measures should be available as methods, as long as they can be calculated from a confusion matrix and/or a ranking. They may apply at a given threshold (e.g. accuracy) or across thresholds (e.g. average precision). Provide a method for each one, but call out in the documentation which ones are aliases of others.

Common performance measures to be included:

  • accuracy
  • precision
  • recall
  • sensitivity
  • specificity
  • true positive rate
  • false positive rate
  • positive predictive value
  • negative predictive value
  • average precision
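
For reference, here is a minimal sketch of how the per-threshold measures above reduce to confusion matrix counts. The class and method names are illustrative only, not Roc's actual API.

```java
// Illustrative only: per-threshold measures defined in terms of the four
// confusion matrix counts at a single threshold. Not Roc's API.
public final class ConfusionMatrixMeasures {
    public final long tp, fp, tn, fn;  // counts at one threshold

    public ConfusionMatrixMeasures(long tp, long fp, long tn, long fn) {
        this.tp = tp; this.fp = fp; this.tn = tn; this.fn = fn;
    }

    public double accuracy()    { return (double) (tp + tn) / (tp + fp + tn + fn); }
    public double precision()   { return (double) tp / (tp + fp); }  // a.k.a. positive predictive value
    public double recall()      { return (double) tp / (tp + fn); }  // a.k.a. sensitivity, true positive rate
    public double specificity() { return (double) tn / (tn + fp); }  // a.k.a. true negative rate
    public double falsePositiveRate()       { return (double) fp / (fp + tn); }  // 1 - specificity
    public double negativePredictiveValue() { return (double) tn / (tn + fn); }
}
```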

Would there be a way to make a general aggregate/average function that could be combined with any of the per-threshold methods? For example, the precision method plus an averaging aggregator would yield average precision.
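
A hypothetical sketch of what such a general aggregator could look like, reusing the illustrative `ConfusionMatrixMeasures` class from the sketch above; none of these names are part of Roc:

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of the proposed general aggregator: average any
// per-threshold measure over a list of confusion matrices (one per threshold).
public final class Aggregate {
    public static double average(List<ConfusionMatrixMeasures> perThreshold,
                                 ToDoubleFunction<ConfusionMatrixMeasures> measure) {
        return perThreshold.stream().mapToDouble(measure).average().orElse(Double.NaN);
    }
}

// Usage (illustrative): Aggregate.average(matrices, ConfusionMatrixMeasures::precision)
// corresponds to average precision only if `matrices` holds exactly the right
// set of thresholds.
```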

afbarnard · Mar 29, 2013

What are the use cases for the general aggregation besides average precision?

My first thought is that the subtleties of the different scores would make a general aggregator more dangerous than useful. For example, average precision should only use the thresholds that correspond to a positive example (with added complications if there are ties in the ranking). But I imagine other aggregations would have other selection criteria. My guess is that to calculate ROC area via an aggregator like this, you would need the true positive rate at each threshold corresponding to a negative example.
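
To make the threshold-selection subtlety concrete, here is a minimal sketch (again not Roc's API) of average precision computed directly from a ranked list of labels: it evaluates precision only at the ranks of the positive examples, and ignores ties for simplicity.

```java
import java.util.List;

// Minimal, illustrative average precision over a ranked list of labels,
// best-scored first (true = positive example). Precision is accumulated
// only at the ranks where a positive example appears.
public final class AveragePrecision {
    public static double of(List<Boolean> rankedLabels) {
        double sumPrecision = 0.0;
        int positivesSeen = 0;
        for (int rank = 1; rank <= rankedLabels.size(); rank++) {
            if (rankedLabels.get(rank - 1)) {
                positivesSeen++;
                sumPrecision += (double) positivesSeen / rank;  // precision at this rank
            }
        }
        return positivesSeen == 0 ? Double.NaN : sumPrecision / positivesSeen;
    }
}
```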

Since most of the threshold scores mentioned are not used in any aggregate scores I'm familiar with, I think it's more understandable and usable to have specific functions for average precision and whatever other aggregate scores turn out to be useful.

kboyd · Apr 2, 2013