bootstrapped

Memory leak

Open ofir-reich opened this issue 8 years ago • 1 comment

Thanks for the package! Looks really useful for A/B tests that I'm running.

I tried to run your example code. When I executed the following line: print(bs.bootstrap_ab(test, ctrl, bs_stats.mean, bs_compare.percent_change)) it consumed a few GB of memory before raising a MemoryError.

I'm on Ubuntu 16.04 with Python 2.7. Let me know if you need any other specs. Any guess what is going wrong?

Jan 22 '18 09:01 ofir-reich

This is very good feedback - thank you! @ofir-reich memory size is definitely the problem. I will update the example to require less memory (to be more user friendly).

Have you seen this example? https://github.com/facebookincubator/bootstrapped/blob/master/examples/large_data.ipynb

There I go over how to handle this specific problem (make iteration_batch_size smaller).

The default behavior for the library is to create matrices of size num_bootstraps*input_array_size. In the example, num_bootstraps=10000 and the two input arrays have 10k and 50k elements, so two large matrices get created. However, I give the option to chunk this operation into smaller matrices, with the chunk size controlled by iteration_batch_size. This trades execution speed for memory reduction (not too bad in practice unless you make the batch size very small).
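To make the sizes above concrete, here is a rough, standalone sketch of the arithmetic and of the batching idea. It is not the library's code: the function names are hypothetical, it assumes each resampled value is stored as an 8-byte float, and it uses only the Python 3 standard library rather than the library's own internals.

```python
import random
import statistics

BYTES_PER_VALUE = 8  # assumption: resampled values are stored as 64-bit floats


def resample_matrix_bytes(num_rows, input_array_size):
    """Bytes needed for a num_rows x input_array_size matrix of resamples."""
    return num_rows * input_array_size * BYTES_PER_VALUE


def batched_bootstrap_means(values, num_bootstraps, iteration_batch_size, seed=0):
    """Bootstrap the mean while holding only one batch of resamples in memory.

    Hypothetical helper for illustration; the parameter names mirror the
    ones used in this thread.
    """
    rng = random.Random(seed)
    n = len(values)
    means = []
    done = 0
    while done < num_bootstraps:
        batch = min(iteration_batch_size, num_bootstraps - done)
        # Only `batch` resamples of length n exist at any one time.
        resamples = [rng.choices(values, k=n) for _ in range(batch)]
        means.extend(statistics.fmean(r) for r in resamples)
        done += batch
    return means


# Sizes from the example: 10,000 bootstraps over arrays of 10k and 50k values.
small_full_mb = resample_matrix_bytes(10_000, 10_000) // 10**6
large_full_mb = resample_matrix_bytes(10_000, 50_000) // 10**6
# Same 50k array, but only 100 bootstrap rows held at once.
large_batched_mb = resample_matrix_bytes(100, 50_000) // 10**6
print(small_full_mb, large_full_mb, large_batched_mb)  # 800 4000 40
```

Under these assumptions the two full matrices come to roughly 0.8 GB and 4 GB, which is consistent with the "few GB" reported above, while a batch of 100 rows over the 50k array needs about 40 MB. The cost is more loop iterations, which is where the speed trade-off comes from.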

Can you give that a shot and let me know how it goes? Thanks!

Jan 25 '18 16:01 spencebeecher