Rory Mitchell

48 comments by Rory Mitchell

This occurs because Treelite stores these binary classification models as vector-leaf models, which the postprocessing code handles differently and without accounting for thresholds.

I have updated this so that models are generated using the original code but cached automatically using joblib.Memory. Below is an example of the test matrix (68 items), taking...

At this stage I think the path of least resistance is to output the Shapley values only for the positive class. This is not ideal because generally we want Shapley...

Yes, this is a good idea. This is currently only a problem for the random forest models. In the case of XGBoost we can tell from the output transformation that...

It's often necessary to repeat experiments exactly in ML applications. Specifically, I want to use it for quantising data in a distributed setting for an XGBoost/LightGBM-type algorithm. The default...

> I am afraid you did not quite explain why is it necessary to repeat exactly. What happens if it is a bit different? Here is an article that discusses...

If you think I would be better off using another type of sketch I would also be interested in suggestions.

One problem here is that the Mersenne Twister PRNG is quite large I think, and the alternatives are not strong PRNGs. At the moment I have moved to another solution...
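To put a rough number on "quite large", here is a minimal sketch using the standard library's `std::mt19937`. The helper name `mt19937_state_bytes` is made up for illustration, and the figure is a lower bound derived from the engine's documented parameters, not a measurement of any particular sketch library's generator.

```cpp
#include <cstddef>
#include <random>

// Hypothetical helper: lower bound on the state carried by std::mt19937,
// computed from its standard parameters (624 words of 32 bits each).
constexpr std::size_t mt19937_state_bytes() {
    return std::mt19937::state_size * std::mt19937::word_size / 8;
}

// 624 * 32 / 8 = 2496 bytes, before any bookkeeping such as the position index.
static_assert(mt19937_state_bytes() == 2496, "unexpected mt19937 parameters");
```

Roughly 2.4 KiB per generator can matter if every sketch instance embeds its own engine.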

I need natural log unfortunately. Is it theoretically possible to use std::is_constant_evaluated() to "overload" for constexpr?

I was looking for cuda::std::sort in this PR https://github.com/NVIDIA/cccl/pull/6585. I ended up using cuda::std::partial_sort on the full range for a statistical test. Would be great to replace this with proper...
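The workaround described here relies on the fact that a partial sort with `middle == last` orders the entire range. A minimal host-side sketch with `std::partial_sort` follows; the helper names and the choice of a Kolmogorov-Smirnov distance as the statistical test are assumptions for illustration, and the `cuda::std` variant is assumed (not verified here) to mirror the standard signature.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sort the whole range by passing `last` as the `middle` argument.
inline void sort_via_partial_sort(std::vector<double>& v) {
    std::partial_sort(v.begin(), v.end(), v.end());
}

// Hypothetical statistical test: Kolmogorov-Smirnov distance between the
// empirical CDF of `samples` and the Uniform[0, 1) CDF, F(x) = x.
inline double ks_uniform(std::vector<double> samples) {
    sort_via_partial_sort(samples);
    const double n = static_cast<double>(samples.size());
    double d = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const double below = static_cast<double>(i) / n;      // ECDF just before samples[i]
        const double above = static_cast<double>(i + 1) / n;  // ECDF at samples[i]
        d = std::max({d, samples[i] - below, above - samples[i]});
    }
    return d;
}
```

This produces a correctly sorted range, but `std::partial_sort` is typically heap-based, so a dedicated sort would usually be the better fit once one is available.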