embed
Extra recipes for predictor embeddings
Hi Emil, I am planning to implement a `step_catboost` (along these [lines](https://github.com/scikit-learn-contrib/category_encoders/blob/master/category_encoders/cat_boost.py)). IMHO, it belongs here. Let me know if you are open to a PR.
Hi guys, 1. RStudio crashes after using `step_umap()`. I'm getting "R Session Aborted, R encountered a fatal error..." Code: `library(recipes)` `library(dplyr)` `library(ggplot2)` `library(embed)` `recipe(Species ~ ., data = iris)`...
Scenario: when using `step_umap()` (I guess this applies to other methods like `step_knn()` too), observations are assigned to different clusters. It would be really nice to know the following: 1) Which values...
`step_pca` is very useful, but is slow and memory-intensive when run on more than a few hundred features, even if `num_comp` is much smaller than p. (In my experience this...
Seems like [there is a bug 🐛 in `step_umap()` when trying](https://stackoverflow.com/questions/68620015/tidymodels-and-embed-error-when-trying-to-bake-prepped-recipe-with-step-umap) to save a prepped recipe as `.rds` and read it back to apply it to new data. ``` r library(tidymodels)...
In insurance, the use of weights is very important, so it would be valuable for these steps to also support weights when categories are joined.
It became a problem in this case where I couldn't get consistent results even with seeds. https://github.com/tidymodels/tidymodels.org/tree/main/content/learn/models/sub-sampling
In some situations, when you are modeling counts or fitting a GLM, it is important to set an offset in the model. It would be interesting if **step_lencode_glm** allowed the user to...
https://cran.r-project.org/web/packages/poismf/index.html
1. find the correlation structure
2. find groups of highly correlated features
3. replace each group with the PC of just those features
4. …
5. profit

look at correlation...
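
The first three steps of that list can be sketched in base R. This is only an illustration, not an existing `embed` or `recipes` step: the helper name `group_pca()`, the `threshold` argument, and the choice of average-linkage hierarchical clustering on `1 - |r|` to form the groups are all assumptions made for the example, and "the PC" is read here as the first principal component of each group.

```r
# Hypothetical helper (not part of embed or recipes): replace each group of
# highly correlated numeric columns with that group's first principal component.
group_pca <- function(x, threshold = 0.7) {
  x <- as.matrix(x)

  # 1. Correlation structure of the features.
  cor_mat <- cor(x)

  # 2. Groups of highly correlated features. Assumption: cluster on the
  #    distance 1 - |r| and cut the tree at height 1 - threshold.
  tree <- hclust(as.dist(1 - abs(cor_mat)), method = "average")
  groups <- cutree(tree, h = 1 - threshold)

  # 3. Replace each group with the first PC of just those features.
  #    A group with a single feature is passed through unchanged.
  out <- lapply(split(colnames(x), groups), function(cols) {
    if (length(cols) == 1) {
      return(x[, cols])
    }
    prcomp(x[, cols, drop = FALSE], center = TRUE, scale. = TRUE)$x[, 1]
  })
  out <- as.data.frame(out)
  names(out) <- paste0("group_", seq_along(out))

  list(groups = groups, data = out)
}

# Usage on the numeric columns of iris.
res <- group_pca(iris[, 1:4])
res$groups      # group membership for each original feature
head(res$data)  # one column per group
```

A real recipe step would also need to store the groupings and PCA loadings at `prep()` time so they can be reapplied to new data at `bake()` time; the sketch above skips that.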