Rory Mitchell comments

Results: 48 comments by Rory Mitchell

[BUG] cuml binary classification models do not observe threshold

This is occurring because these binary classification models are stored by treelite as vector leaf models, which are handled differently in postprocessing code and do not account for thresholds.
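To illustrate the kind of mismatch described here, the following is a minimal sketch rather than the actual cuml or treelite code. It assumes that a vector leaf model emits one probability per class for each row and that the postprocessing step picks the class by argmax; the function names `predict_argmax` and `predict_with_threshold` are hypothetical.

```python
def predict_argmax(probs):
    """Pick the class with the largest probability; any threshold is ignored."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def predict_with_threshold(probs, threshold=0.5):
    """Binary case: predict class 1 when its probability reaches the threshold."""
    return [1 if row[1] >= threshold else 0 for row in probs]


# Hypothetical per-class outputs [p(class 0), p(class 1)] for three rows.
probs = [[0.7, 0.3], [0.4, 0.6], [0.8, 0.2]]

argmax_labels = predict_argmax(probs)
threshold_labels = predict_with_threshold(probs, threshold=0.25)
```

With a threshold of 0.25 the first row flips to the positive class, which an argmax-only path can never produce.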

Rewrite tests

I have updated this so that models are generated using the original code but cached automatically using joblib.Memory. Below is an example of the test matrix (68 items), taking...

[Bug] Wrong output shapes for Shapley values

At this stage I think the path of least resistance is to output the Shapley values only for the positive class. This is not ideal because generally we want Shapley...

[Bug] Wrong output shapes for Shapley values

Yes, this is a good idea. This is currently only a problem for the random forest models. In the case of XGBoost we can tell from the output transformation that...

It's often necessary to repeat experiments exactly in ML applications. Specifically, I want to use it for quantising data in a distributed setting for an XGBoost/LightGBM-type algorithm. The default...

Determinism

> I am afraid you did not quite explain why is it necessary to repeat exactly. What happens if it is a bit different? Here is an article that discusses...

Determinism

If you think I would be better off using another type of sketch I would also be interested in suggestions.

Determinism

One problem here is that the Mersenne Twister PRNG is quite large, I think, and the alternatives are not strong PRNGs. At the moment I have moved to another solution...
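As a rough illustration of both points (exact repeatability from a seed, and the size of the Mersenne Twister state), here is a small sketch using Python's standard `random` module, which is documented to use the Mersenne Twister. It is not the library code under discussion; `sample_indices` is a hypothetical helper.

```python
import random


def sample_indices(seed, n, k):
    """Draw k distinct indices from range(n) using a dedicated, seeded generator."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    return rng.sample(range(n), k)


# The same seed reproduces the same sample exactly.
first = sample_indices(42, 1000, 5)
second = sample_indices(42, 1000, 5)

# The generator state is large: 624 32-bit words plus a position index.
state_words = len(random.Random(0).getstate()[1])
```

Carrying a state of that size in every sketch object is the cost the comment refers to; a smaller generator trades that memory for weaker statistical guarantees.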

[FEA]: constexpr log functions

I need natural log unfortunately. Is it theoretically possible to use std::is_constant_evaluated() to "overload" for constexpr?

[FEA]: Implement GPU friendly single-threaded sorting algorithms

I was looking for cuda::std::sort in this PR https://github.com/NVIDIA/cccl/pull/6585. I ended up using cuda::std::partial_sort on the full range for a statistical test. Would be great to replace this with proper...