smartcore icon indicating copy to clipboard operation
smartcore copied to clipboard

Random forest parallelization

Open daraghk opened this issue 3 years ago • 3 comments

daraghk avatar Jul 09 '22 15:07 daraghk

@daraghk , thank you very much for your contribution to Smartcore! The code looks good, the only thing that is left are clippy warnings and better test coverage. Once you have closed all pending checks we can merge this PR to development

VolodymyrOrlov avatar Jul 14 '22 01:07 VolodymyrOrlov

@VolodymyrOrlov happy to contribute! I've fixed the code coverage issue but the linting errors are unrelated to any code I touched - you can see the same issues in the other random forest pr which is open at the moment.

daraghk avatar Jul 14 '22 11:07 daraghk

Codecov Report

Merging #137 (e9da303) into development (b4a807e) will increase coverage by 0.00%. The diff coverage is 85.96%.

@@             Coverage Diff              @@
##           development     #137   +/-   ##
============================================
  Coverage        83.40%   83.41%           
============================================
  Files               78       79    +1     
  Lines             8377     8453   +76     
============================================
+ Hits              6987     7051   +64     
- Misses            1390     1402   +12     
Impacted Files Coverage Δ
src/linalg/ndarray_bindings.rs 83.18% <ø> (ø)
src/math/num.rs 84.09% <ø> (ø)
src/ensemble/random_forest_regressor.rs 73.10% <84.09%> (+2.51%) :arrow_up:
src/ensemble/random_forest_classifier.rs 75.12% <86.56%> (+4.09%) :arrow_up:
src/dataset/digits.rs 92.85% <100.00%> (+0.54%) :arrow_up:
src/svm/svc.rs 89.45% <0.00%> (-0.31%) :arrow_down:
src/linear/lasso_optimizer.rs 94.11% <0.00%> (ø)
src/lib.rs 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b4a807e...e9da303. Read the comment docs.

codecov-commenter avatar Jul 14 '22 11:07 codecov-commenter

Sorry to say this, parallelization is not a characteristic we want to push at the moment as Wasm support is prioritized. We are indeed looking for partial (kernel-, column-, row-dependant) computation but at the matrix operations level at the beginning. Rayon is a great library but at the moment we cannot guarantee wider adoption at codebase level, if there are news on making rayon works with all the targets we want to support I would be glad to reopen this.

@daraghk please take a look to the major changes currently in development (heading to release 0.4), it would be great to have more contributions from you.

Mec-iS avatar Oct 31 '22 18:10 Mec-iS

I think that we could do parallelization with a feature flag that is disabled for WASM targets

morenol avatar Oct 31 '22 18:10 morenol

I know but we have no bandwidth to follow up on all these features, creating a feature for one case can be only an increase in complexity with very few benefits. We need to prioritise or allocate more time or have somebody that carries on the task of extending parallelization to more algorithms.

Mec-iS avatar Oct 31 '22 19:10 Mec-iS