Andrej Erkelens
@scottjlee following up on this - my perf for tabular data loading (arrays of 12K float32s) goes from about 200s for 1 epoch to 1200s for 1 epoch when I...
@scottjlee no problem. I am using file-based shuffling. What if you perform a `map_batches(shuffle_func, batch_size=None)` on the dataset? Is it possible you could get better shuffle performance -...
@jmalkin Hi - to circle back on this topic, I have found some very minor nondeterminism when merging KLL doubles sketches in this library. However, when I read the...
@AlexanderSaydakov You are right. I can open a new issue if needed. I can try a few more samples, but the same sketches I merged in Python (cpp) which...