Much faster versions of PCA + UMAP exist. Can we implement them?
scikit-learn-intelex (Intel's extension for scikit-learn), cuML, and several other libraries should have optimized variants of PCA and likely of a few other algorithms used here. I can submit a PR implementing a few of these if you'd like. For certain types of vectors, this could give a noticeable speedup.
GPU implementations might harm reproducibility, and there might be other issues I haven't thought about. Thoughts?
I'd be interested in a PR. If you open one, please implement them as a new method, like the existing umap, for now. I'd be especially interested in a speed comparison!
"GPU implementations might harm reproducibility"
So does setting n_jobs>1, so I'd say there's definitely a tradeoff between performance and reproducibility.
Btw this exists:
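from sklearnex import patch_sklearn
# Patch scikit-learn globally so optimized estimators are used where available
patch_sklearn(global_patch=True)
# Import sklearn only after patching
import sklearn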
Source: https://uxlfoundation.github.io/scikit-learn-intelex/latest/global-patching.html
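For the speed comparison requested earlier in the thread, here is a rough sketch of a timing harness. It is not this project's code: `best_fit_time` and `compare_pca` are hypothetical helper names, the data is random, and the shape (20,000 x 384) and `n_components=50` are placeholder assumptions. It imports `sklearnex.decomposition.PCA` directly instead of patching, so both variants can be timed in the same process, and it skips the accelerated run if scikit-learn-intelex is not installed.

```python
import time


def best_fit_time(make_estimator, X, repeats=3):
    """Return the fastest wall-clock time (in seconds) over `repeats` fresh fits."""
    timings = []
    for _ in range(repeats):
        estimator = make_estimator()  # fresh estimator each run, no reused state
        start = time.perf_counter()
        estimator.fit(X)
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare_pca(n_samples=20_000, n_features=384, n_components=50):
    """Time stock scikit-learn PCA against the scikit-learn-intelex PCA.

    The sizes are illustrative assumptions, not values from this project.
    """
    import numpy as np
    from sklearn.decomposition import PCA as StockPCA

    # Random stand-in data; real embedding vectors may behave differently.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n_samples, n_features)).astype(np.float32)

    results = {
        "sklearn": best_fit_time(lambda: StockPCA(n_components=n_components), X)
    }
    try:
        # Assumption: scikit-learn-intelex is installed.
        from sklearnex.decomposition import PCA as FastPCA
    except ImportError:
        return results  # only the stock timing is available
    results["sklearnex"] = best_fit_time(
        lambda: FastPCA(n_components=n_components), X
    )
    return results
```

Calling `compare_pca()` returns a dict of best-of-three fit times per backend. Taking the minimum rather than the mean reduces noise from other processes, but numbers from random data should only be treated as a first indication.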