step_poisson_matrix_factorization()

Open EmilHvitfeldt opened this issue 2 years ago • 3 comments

https://cran.r-project.org/web/packages/poismf/index.html

EmilHvitfeldt avatar Oct 01 '23 22:10 EmilHvitfeldt

I read about Gamma-Poisson factorization for single categorical columns in Patricio Cerda and Gaël Varoquaux, "Encoding high-cardinality string categorical variables" (2019). The method is described there as analogous to solving a non-negative matrix factorization (NMF) problem with the generalized Kullback-Leibler divergence.
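To make the NMF analogy concrete, here is a minimal sketch of NMF under the generalized Kullback-Leibler divergence, using the classic batch multiplicative updates (Lee-Seung style). It is not the paper's online algorithm and not how skrub's GapEncoder or {poismf} are implemented. The toy count matrix `V`, the function names `kl_divergence` and `nmf_kl`, and the small constant `EPS` are assumptions made for illustration only.

```python
import math
import random

EPS = 1e-12  # assumed guard against division by zero and log(0)


def kl_divergence(V, W, H):
    """Generalized KL divergence D(V || WH) between non-negative matrices."""
    total = 0.0
    for i, row in enumerate(V):
        for j, v in enumerate(row):
            wh = sum(W[i][k] * H[k][j] for k in range(len(H))) + EPS
            if v > 0:
                total += v * math.log(v / wh)
            total += wh - v
    return total


def product(W, H):
    """Plain matrix product W @ H, with EPS added to every entry."""
    rank, m = len(H), len(H[0])
    return [[sum(Wi[k] * H[k][j] for k in range(rank)) + EPS for j in range(m)]
            for Wi in W]


def nmf_kl(V, rank, n_iter=200, seed=0):
    """Factor V (n x m) into W (n x rank) and H (rank x m), all non-negative."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    history = [kl_divergence(V, W, H)]
    for _ in range(n_iter):
        # H[k][j] *= sum_i W[i][k] * V[i][j] / (WH)[i][j] / sum_i W[i][k]
        WH = product(W, H)
        for k in range(rank):
            col_sum = sum(W[i][k] for i in range(n)) + EPS
            for j in range(m):
                num = sum(W[i][k] * V[i][j] / WH[i][j] for i in range(n))
                H[k][j] *= num / col_sum
        # W[i][k] *= sum_j H[k][j] * V[i][j] / (WH)[i][j] / sum_j H[k][j]
        WH = product(W, H)
        for k in range(rank):
            row_sum = sum(H[k]) + EPS
            for i in range(n):
                num = sum(H[k][j] * V[i][j] / WH[i][j] for j in range(m))
                W[i][k] *= num / row_sum
        history.append(kl_divergence(V, W, H))
    return W, H, history


# Toy count matrix (rows: categories, columns: substring counts), made up here.
V = [
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
]
W, H, history = nmf_kl(V, rank=2)
print(f"KL divergence: {history[0]:.4f} -> {history[-1]:.4f}")
```

The rows of `W` play the role of the low-dimensional encoding of each category, which is what an encoder step would return as new columns.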

The paper includes an interesting online algorithm.

Does it make sense to use {reticulate} with this Python implementation? https://skrub-data.org/stable/reference/generated/skrub.GapEncoder.html

jrosell avatar Dec 10 '24 14:12 jrosell

Without having looked at the documentation, I lean toward translating the method to R rather than using {reticulate}, purely on the basis of developer burden. Using {reticulate} in a package is already not the best experience, and then you have to worry about breaking changes in the Python implementation, among other things.

EmilHvitfeldt avatar Dec 10 '24 19:12 EmilHvitfeldt