Add option to exclude columns in `step_nearmiss()`

Open 3styleJam opened this issue 7 months ago • 0 comments

I have an existing recipe with numerous steps. I added `step_nearmiss(response_var, under_ratio = 2, skip = TRUE, seed = 8237)`, but when I started the tuning process I received multiple instances of:

→ A | error:   Error in `step_nearmiss()`:
               Caused by error in `prep()`:
               ✖ All columns selected for the step
                 should be double or integer.

This happens because my recipe contains many factor variables, and the mean distance of the neighbours to each point in the majority class can only be calculated with numeric variables. I looked at the help page for `step_nearmiss()` and there doesn't seem to be an argument for selecting the columns on which the distance calculation is based. Could one be added, or alternatively an argument that excludes columns? I would prefer not to convert all my factor variables to numeric and back again just for this step, as I've already dealt with the data types earlier in my processing. Thank you.
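
For reference, a minimal sketch of the situation and one possible interim workaround, not the requested feature. It assumes a made-up data frame (`dat`, with hypothetical columns `x_num` and `x_fct`) and uses `step_dummy(all_nominal_predictors())` from recipes to turn the factor predictors into numeric indicator columns inside the recipe, before `step_nearmiss()` runs. Whether dummy-encoded columns give a sensible distance for a particular data set is a separate question.

```r
library(recipes)
library(themis)

# Hypothetical imbalanced data: 40 rows of class "a", 10 rows of class "b".
set.seed(1)
dat <- data.frame(
  response_var = factor(c(rep("a", 40), rep("b", 10))),
  x_num = rnorm(50),
  x_fct = factor(sample(c("u", "v", "w"), 50, replace = TRUE))
)

# Without step_dummy(), prep() stops with the "should be double or integer"
# error because x_fct is a factor. Encoding it first makes every predictor
# numeric by the time step_nearmiss() sees the data.
rec <- recipe(response_var ~ ., data = dat) |>
  step_dummy(all_nominal_predictors()) |>
  step_nearmiss(response_var, under_ratio = 2, seed = 8237)

prepped <- prep(rec)
baked <- bake(prepped, new_data = NULL)

# Class counts after under-sampling the majority class.
table(baked$response_var)
```

This keeps the source data untouched, but the factor columns still come out of the recipe dummy-encoded, which is why a built-in way to exclude columns would be preferable.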

3styleJam, Jun 25 '25 12:06