piotrszul

Results 73 issues of piotrszul

This is the remaining work from issue #140. That is: - Add a command line option for predicting class probabilities - Implement command line predictions from a JSON-serialised model - Adding...

Refactor additive handling of paths like `spark.jars` in Hail's configuration.
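
As a rough illustration of what "additive handling" could mean here, the sketch below appends new entries to a comma-separated, path-like setting instead of overwriting it. The helper name `append_paths` and the plain-dict configuration are hypothetical and are not taken from Hail or from this project; the only assumption about Spark is that `spark.jars` holds a comma-separated list.

```python
def append_paths(conf, key, new_paths, sep=","):
    """Return a copy of `conf` with `new_paths` appended to the list stored under `key`.

    Hypothetical helper: `conf` is a plain dict standing in for a real
    configuration object. Existing entries keep their order and duplicates
    are dropped.
    """
    existing = [p for p in conf.get(key, "").split(sep) if p]
    # dict.fromkeys preserves first-seen order while removing duplicates
    merged = list(dict.fromkeys(existing + list(new_paths)))
    updated = dict(conf)
    updated[key] = sep.join(merged)
    return updated


conf = {"spark.jars": "a.jar,b.jar"}
conf = append_paths(conf, "spark.jars", ["b.jar", "c.jar"])
print(conf["spark.jars"])
```

Returning a new dict rather than mutating the input keeps the helper easy to test; a real refactoring would need to follow whatever configuration API the code base already uses.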

techdebt

Align code style checks for Scala, Java and Python with IntelliJ formatting.

techdebt

Write a concise user manual using Sphinx and make it available on Read the Docs. The audience should be primarily bioinformaticians. The scope should cover: * installing VariantSpark * running importance...

The procedure of selecting split variables in case of equal reduction in impurity is slightly biased towards variables with larger indexes. In the previous non-reproducible approach it was caused by...
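
One generic way to remove index bias when several variables give the same impurity reduction is to break ties uniformly at random with a seeded generator, which also keeps runs reproducible. The sketch below is only an illustration of that idea, not the project's implementation; `pick_best_split` and its arguments are hypothetical names.

```python
import random


def pick_best_split(reductions, rng):
    """Return the index of the largest impurity reduction.

    Ties are broken uniformly at random in a single pass (reservoir
    sampling), so no index is favoured. `rng` is a seeded random.Random,
    which makes the choice reproducible. Returns -1 for an empty input.
    """
    best_idx = -1
    best_val = float("-inf")
    ties = 0
    for idx, val in enumerate(reductions):
        if val > best_val:
            # Strictly better candidate: restart the tie count
            best_idx, best_val, ties = idx, val, 1
        elif val == best_val:
            ties += 1
            # Replace the current pick with probability 1/ties
            if rng.randrange(ties) == 0:
                best_idx = idx
    return best_idx


rng = random.Random(13)  # fixed seed gives the same choice on every run
print(pick_best_split([0.5, 0.2, 0.5, 0.5], rng))
```

With three tied candidates each one is selected with probability 1/3, regardless of its position in the list.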

Some ideas to consider for improved performance: * splits coming from a single variable are likely to be very sparse -> as such it may not make sense to return...

enhancement

This is noticeable by comparing runtime on sparse vs dense synthetic regression datasets. The sparse ones run much slower although intuitively they should run faster.

Make it somehow possible to group tests based on the Spark context they need. Currently only one context is possible for all tests, while three different contexts are needed -...

techdebt