Add option `n_modes="all"` to perform the full decomposition

Open nicrie opened this issue 1 year ago • 2 comments

As an aside, it would be nice to have an option like `n_modes="all"`, which figures out the rank of the data early on and sets `n_modes = rank`. When I want to do experiments like this, I usually just first try to fit the model with a million modes, then get the rank from the error message and plug it in.

Looks like `sklearn.decomposition.PCA` does this by default: `n_components=None`.
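For reference, a minimal sketch of the scikit-learn behaviour mentioned above. The random data and array shape are made up for illustration; the only thing shown is that leaving `n_components=None` keeps `min(n_samples, n_features)` components with the default solver.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data only: 20 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))

# n_components=None is the default, so no mode count is passed here.
pca = PCA()
pca.fit(X)

# After fitting, n_components_ holds the number of components actually kept,
# which is min(n_samples, n_features) in this setting.
n_kept = pca.n_components_
print(n_kept)
```

An `n_modes="all"` option in xeofs would play a similar role, except that it would be opt-in rather than the default.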

Originally posted by @slevang in https://github.com/xarray-contrib/xeofs/issues/156#issuecomment-1986889963

nicrie · Mar 09 '24 17:03

Agree with you @slevang, that'd be nice. Currently, the rank is only computed when we fit the Decomposer:

https://github.com/xarray-contrib/xeofs/blob/a3cb204171dbbd1f09f2c083099cf3b67f625495/xeofs/models/decomposer.py#L72-L74

The first opportunity to compute the rank is probably already within the Stacker of the Preprocessor, since we then know the final size of the matrix.

I don't know what you think about the default value. Personally, I mostly prefer a fast decomposition over an exact one. If we made `n_modes="all"` the default, it would always trigger the full SVD.
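A rough sketch of how the idea could look once the stacked matrix size is known. `resolve_n_modes` is a hypothetical helper, not an existing xeofs function, and it assumes that `min(n_samples, n_features)` is an acceptable stand-in for the rank (it is an upper bound; the true rank can be lower). The integer default would stay untouched, so the fast path remains the default.

```python
from typing import Literal, Union


def resolve_n_modes(
    n_modes: Union[int, Literal["all"]],
    n_samples: int,
    n_features: int,
) -> int:
    """Hypothetical helper: turn a user-facing n_modes into a concrete integer.

    Assumes the stacked 2D matrix has shape (n_samples, n_features) and uses
    min(n_samples, n_features) as an upper bound on its rank.
    """
    max_modes = min(n_samples, n_features)

    # Opt-in full decomposition: use every mode the matrix can support.
    if n_modes == "all":
        return max_modes

    # Reject anything that is not a positive integer (bool is excluded on purpose).
    if isinstance(n_modes, bool) or not isinstance(n_modes, int) or n_modes < 1:
        raise ValueError(f"n_modes must be a positive integer or 'all', got {n_modes!r}")

    # Fail early with the bound in the message instead of deep inside the SVD.
    if n_modes > max_modes:
        raise ValueError(f"n_modes={n_modes} exceeds the maximum of {max_modes}")

    return n_modes


# Usage with made-up sizes: 100 time steps, 30 stacked features.
n_all = resolve_n_modes("all", n_samples=100, n_features=30)
n_fast = resolve_n_modes(10, n_samples=100, n_features=30)
```

Raising early with the bound in the message would also replace the "fit with a million modes and read the error" workaround.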

nicrie · Mar 09 '24 17:03

Yep, agree with all that. Fine to leave the default alone, and I was thinking of around the same place in the preprocessors to check the rank. Then it's just a question of how to propagate the information back to the rest of the class; I haven't looked at the details yet.

slevang · Mar 09 '24 17:03