spatialsample icon indicating copy to clipboard operation
spatialsample copied to clipboard

Updates for current rsample

Open hfrick opened this issue 3 months ago • 1 comments

Closes #166

Some recent-ish changes in rsample are affecting spatialsample: rsample now prohibits LOO and has new type checkers.

The related breakages show up in the revdep tests for testthat, which is supposed to go out in the next week or so.

This PR removes test failures so that we can submit to CRAN (either after we get the 2-week deadline or, ideally, before testthat goes out).

Do you have bandwidth to look at the changes in the snapshots? I think some of the tests use LOO very intentionally to trigger helpful warnings from spatialsample - which might now be dead code. Edit: I've removed such a piece of dead code, the error when trying to do repeated LOO in spatial_vfold_cv(). And I've now seen the buffering.Rmd article error because it tries to use LOO for "leave-one-disc-out cross-validation". Taking this as confirmation that there was intent behind some of the LOO usage and pausing here so you can chip in before I do more :D

hfrick avatar Nov 03 '25 12:11 hfrick

I'll take a look either tonight or tomorrow!

mikemahoney218 avatar Nov 04 '25 10:11 mikemahoney218

Looking into this, I suspect this is going to be a tricky update for spatialsample. As you noticed, we use vfold_cv to enable a few different types of CV, including leave one disc out which has a long history in spatial modeling. Importantly, even though this process starts by creating 1-row test sets, we then expand the test set using the radius argument -- so I think the object shouldn't be subject to any of the same restrictions as LOO CV rsets/rsplits are.

We could potentially check to see if v == nrow(data), use vfold or loo depending on the check result, and then "undo" everything loo_cv does if needed? I don't see a non-hacky way to keep this functionality otherwise. Would rsample be willing to secretly pass prevent_loo in dots to vfold_splits?

mikemahoney218 avatar Nov 06 '25 14:11 mikemahoney218

Does this close #168 by the way? (Thank you!)

mikemahoney218 avatar Nov 06 '25 14:11 mikemahoney218

Ah, I see. Yes, then we should make leave-one-disk-out happen!

I think going via the designated loo_cv() is the better option here. It doesn't require much of a hack since we're re-classing at the end of spatial_buffer_vfold_cv() anyway, we can keep the hack under the hood instead of it also touching the signature of a user-facing function, and we can do it all in spatialsample. I'll open a fresh PR with that!

hfrick avatar Nov 16 '25 17:11 hfrick

Using air for formatting will likely touch more files, I'll send another PR for that! (Just after #170 is merged, that'll make things easier.)

hfrick avatar Nov 16 '25 17:11 hfrick