yardstick icon indicating copy to clipboard operation
yardstick copied to clipboard

New metric request: NSE

Open atsyplenkov opened this issue 2 years ago • 0 comments

Nash-Sutcliffe efficiency (NSE)

Please consider adding the Nash and Sutcliffe Efficiency (NSE) metric. It is highly used in hydrological studies, as it indicates how different the modeled-observed relationship is from the 1:1 line, making it a more robust metric than R².

The NSE values, as described by Nash and Sutcliffe (1970), range from -Inf to 1.

  • An efficiency of 1 (NSE = 1) corresponds to a perfect match between the modeled and observed data.
  • An efficiency of 0 (NSE = 0) indicates that the model predictions are as accurate as the mean of the observed data.
  • An efficiency less than zero (-Inf < NSE < 0) occurs when the observed mean is a better predictor than the model.

Essentially, the closer the model efficiency is to 1, the more accurate the model is.

Reprex

library(tidymodels) # CRAN v1.1.1
library(rlang) # CRAN v1.1.1

# Vector implementation
nse_impl <-
  function(truth,
           estimate,
           case_weights = NULL) {
    1 - (sum((truth - estimate) ^ 2) / sum((truth - mean(truth)) ^ 2))
    
  }

nse_vec <-
  function(truth,
           estimate,
           na_rm = TRUE,
           case_weights = NULL,
           ...) {
    check_numeric_metric(truth, estimate, case_weights)
    
    if (na_rm) {
      result <-
        yardstick_remove_missing(truth, estimate, case_weights)
      
      truth <- result$truth
      estimate <- result$estimate
      case_weights <- result$case_weights
      
    } else if (yardstick_any_missing(truth, estimate, case_weights)) {
      return(NA_real_)
      
    }
    
    nse_impl(truth, estimate, case_weights = case_weights)
    
  }

# Vector Check
data("solubility_test")

nse_vec(truth = solubility_test$solubility,
        estimate = solubility_test$prediction)
#> [1] 0.8789135


# Dataframe implementation
nse <- function(data, ...) {
  UseMethod("nse")
}

nse <-
  new_numeric_metric(nse, direction = "maximize")

nse.data.frame <-
  function(data,
           truth,
           estimate,
           na_rm = TRUE,
           case_weights = NULL,
           ...) {
    numeric_metric_summarizer(
      name = "nse",
      fn = nse_vec,
      data = data,
      truth = !!enquo(truth),
      estimate = !!enquo(estimate),
      na_rm = na_rm,
      case_weights = !!enquo(case_weights)
    )
    
  }

nse(solubility_test, truth = solubility, estimate = prediction)
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 nse     standard       0.879

Created on 2023-10-03 with reprex v2.0.2

atsyplenkov avatar Oct 02 '23 21:10 atsyplenkov