Donal Lowsley-Williams
Seeing improved performance with a significantly reduced number of partitions. Previously I was working with 32-650 partitions on my data; I've been experimenting with only 1-2 and seeing much better results.
FYI, Yungcero and I are working on the same project. Num Partitions: for the tests below, we used the default partition count of around 330. We experimented with low partitions...
Now matching the partition count to the executor count (so just 1 partition, then?) and testing again. Will post updates here.
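For anyone following along, here is a minimal sketch of what "matching partitions to executors" could look like in PySpark. The helper names `target_partition_count` and `repartition_to_executors`, and the `tasks_per_executor` parameter, are hypothetical and only for illustration; the one real Spark call assumed is `DataFrame.repartition(n)`. This is not a confirmed fix for the error in this thread, just a way to make the partition count explicit.

```python
def target_partition_count(num_executors: int, tasks_per_executor: int = 1) -> int:
    """Hypothetical helper: one partition per concurrent task slot."""
    if num_executors < 1 or tasks_per_executor < 1:
        raise ValueError("num_executors and tasks_per_executor must be >= 1")
    return num_executors * tasks_per_executor


def repartition_to_executors(df, num_executors: int, tasks_per_executor: int = 1):
    """Repartition `df` so its partition count matches the executor task slots.

    `df` is assumed to be a PySpark DataFrame; the only method used is
    DataFrame.repartition(numPartitions).
    """
    n = target_partition_count(num_executors, tasks_per_executor)
    return df.repartition(n)


# Usage sketch (assumes an existing PySpark DataFrame named train_df):
#   train_df = repartition_to_executors(train_df, num_executors=3)
#   train_df.rdd.getNumPartitions()  # check the resulting partition count
```

Keeping the count in one helper makes it easy to rerun the same test with 1, 3, or more partitions without touching the rest of the job.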
Here is the output with just 1 partition and 20 million rows:
```
23/08/02 18:49:44 WARN TaskSetManager: Lost task 0.0 in stage 32.1 (TID 1046) (10.244.2.3 executor 1): java.lang.NullPointerException
	at com.microsoft.azure.synapse.ml.lightgbm.NetworkManager$.parseExecutorPartitionList(NetworkManager.scala:178)
	at...
```
Thanks for the tip on the autoscaler. We turned it off and saw improvements in stability, but we are still hitting the NullPointerException. I can confirm this is with the snapshot version....
Here is a Pastebin link for the exact same run, but repartitioned into 3 partitions and with 3 workers set: https://pastebin.com/1xHdWjJ1