cakedev0

Results 45 comments of cakedev0

For models based purely on Collaborative Filtering (like all the models in this package), users who have had only one product interaction are useless in the training set. More precisely,...
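To illustrate the point about single-interaction users, here is a minimal sketch of filtering them out before training. It assumes interactions are plain `(user, item)` pairs; the function name `drop_single_interaction_users` and the `min_interactions` parameter are hypothetical and not part of the package discussed.

```python
from collections import Counter


def drop_single_interaction_users(interactions, min_interactions=2):
    """Keep only (user, item) pairs whose user has at least `min_interactions` rows.

    Hypothetical helper for illustration; not an API of any specific library.
    """
    # Count how many interactions each user has.
    counts = Counter(user for user, _ in interactions)
    # Keep the original order, dropping rows from users below the threshold.
    return [(user, item) for user, item in interactions if counts[user] >= min_interactions]


interactions = [
    ("alice", "book"),
    ("alice", "lamp"),
    ("bob", "book"),  # bob has a single interaction and is dropped
    ("carol", "pen"),
    ("carol", "book"),
]
filtered = drop_single_interaction_users(interactions)
```

Whether such users should be dropped from training only or from evaluation as well depends on the setup; this sketch covers only the filtering step.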

This PR looks great, what's preventing it from being merged? Would a second review from myself allow it to be merged?

Sorry, in the end I won't have time to review this for now. Maybe in one or two weeks.

Note: This issue will be fixed by PR #32119, no additional PR is needed.

I don't think histogram-based splitting is suited for random forests because **it's not adapted for deep trees.** Here is a small benchmark comparing splitting speed for HGB and random forests...

I haven't thought about using a `Pipeline` for that, but that's exactly the idea. I was just thinking about implementing that inside RFs and adapting the code downstream (using `dtype=uint8`...

Thanks a lot for the helpful insights! Dynamic histogramming sounds promising! Do you combine it with multi-threading? (asking this because for small leaves, we also have a problem with multi-threading in...

Some thoughts:

**binning + MAE criterion:**
- I'm pretty sure histogram-based splitting is not compatible with the MAE criterion (try to think about how to compute the median in a "histogram way",...

> > I'm pretty sure histogram based split is not compatible with MAE criterion
>
> I don’t think this is an obstacle. The features X are binned, not the...

I think commit https://github.com/scikit-learn/scikit-learn/commit/4b456682125d74553ee51c7ba961d4fe4c425b70 first introduced this class, under the name `GBM_MSE`.