embed issues

catboost method to embed categorical variables

11

Hi Emil, I am planning to implement a `step_catboost` (along these [lines](https://github.com/scikit-learn-contrib/category_encoders/blob/master/category_encoders/cat_boost.py)). IMHO, it should belong here. Let me know if you are open to a PR.

talegari

feature

step_umap crashing RStudio

18

Hi guys, 1- RStudio crashes after using `step_umap`. I get "R Session Aborted, R encountered a fatal error..." Code: `library(recipes)` `library(dplyr)` `library(ggplot2)` `library(embed)` `recipe(Species ~ ., data = iris)`...

mkhansa

bug

reprex

FR: For each of the UMAP clusters, information/ID on values (from which columns) assigned to which UMAP clusters would be nice

4

Scenario: when using `step_umap()` (I guess this also applies to other methods like `step_knn()`), values are assigned to different clusters. It would be really nice to know the following: 1) Which values...

exsell-jc

Use irlba's truncated SVD to speed up step_pca

6

`step_pca` is very useful, but is slow and memory-intensive when run on more than a few hundred features, even if `num_comp` is much smaller than p. (In my experience this...

dgrtwo

feature

`CppMethod` error when applying prepped UMAP recipe after saving/reading as `.rds`

2

Seems like [there is a bug 🐛 for `step_umap()` when trying](https://stackoverflow.com/questions/68620015/tidymodels-and-embed-error-when-trying-to-bake-prepped-recipe-with-step-umap) to save a prepped recipe as `.rds` and read it back to apply it to new data. `library(tidymodels)`...

juliasilge

bug

Use of Weights and offset for models

2

In insurance, the use of weights is very important. It would be valuable for these steps to also support weights when categories are joined.

naveranoc

reconsider whether step_rose() should have a seed argument

It became a problem in this case where I couldn't get consistent results even with seeds. https://github.com/tidymodels/tidymodels.org/tree/main/content/learn/models/sub-sampling

EmilHvitfeldt

bug

new parameters for step_lencode_glm

In some situations, when you are modeling counts or fitting a GLM, it is important to set an offset for the model. It would be interesting if **step_lencode_glm** allowed the user to...

naveranoc

feature

target encoding

step_poisson_matric_factorization()

3

https://cran.r-project.org/web/packages/poismf/index.html

EmilHvitfeldt

feature

Steps idea: Dealing with correlation

1. find the correlation structure
2. find groups of highly correlated features
3. replace each group with the PC of just those features
4. …
5. profit

look at correlation...

EmilHvitfeldt

feature
