keras2cpp

Skip dropout rate error

Open jvwilliams23 opened this issue 3 years ago • 3 comments

Hi,

I have noticed a potential issue in the following code: https://github.com/pplonski/keras2cpp/blob/ce407cc06ca9886c330c1bf0e152058befcb60bb/keras_model.cc#L431-L433

Are you sure that the dropout layer can be skipped in prediction mode? According to Figure 2 of Srivastava et al. (2014), during training the weights are randomly set to 0 with probability equal to the dropout rate. In prediction mode, the dropout rate is still applied: all weights in the layer are multiplied by it, which disagrees with the code.

Additionally, I have noticed major differences between my Python Keras models and the keras2cpp models with dropout when using the default keras_model.cc. When the weights are multiplied by the dropout rate, the error goes away.

Reference: Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R., 2014. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), pp. 1929-1958.

jvwilliams23 Jul 23 '22 09:07

@jvwilliams23 When I was testing the code, the predictions were exactly the same as in Keras. I don't think that dropout is used in prediction.

pplonski Jul 23 '22 15:07

@pplonski Interesting, I will look more into my code. Were you testing using the MNIST example?

jvwilliams23 Jul 27 '22 08:07

Yes, with MNIST data.

pplonski Jul 27 '22 12:07

Hi @pplonski, I just got around to looking into this further. It seems Keras does not apply dropout in prediction (https://github.com/keras-team/keras/blob/dc95ceca57cbfada596a10a72f0cb30e1f2ed53b/keras/layers/core.py#L116). I guess this is consistent then, but it is strange that it goes against the original paper. I will close this issue.
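A possible explanation for the apparent mismatch is "inverted" dropout. The Keras Dropout documentation states that inputs not set to 0 are scaled up by 1 / (1 - rate) during training, which would leave nothing to rescale at prediction time. The following is only a minimal sketch of that idea, not code from Keras or keras2cpp. The function names are hypothetical, `rate` is assumed to be the probability of dropping a unit, and the scaling is applied to activations rather than weights (equivalent when a linear layer follows).

```python
import random


def paper_dropout(x, rate, training, rng):
    """Paper-style scheme: mask in training, scale by the keep probability at test time."""
    keep = 1.0 - rate
    if training:
        return [v if rng.random() < keep else 0.0 for v in x]
    return [v * keep for v in x]


def inverted_dropout(x, rate, training, rng):
    """Inverted scheme: mask and rescale by 1/keep in training, identity at test time."""
    keep = 1.0 - rate
    if training:
        return [v / keep if rng.random() < keep else 0.0 for v in x]
    return list(x)


def mean_training_output(fn, x, rate, trials, seed=0):
    """Average the training-mode output over many random masks."""
    rng = random.Random(seed)
    totals = [0.0] * len(x)
    for _ in range(trials):
        out = fn(x, rate, True, rng)
        totals = [t + o for t, o in zip(totals, out)]
    return [t / trials for t in totals]
```

In both schemes the prediction-mode output matches the average training-mode output; they differ only in where the scaling happens. With the inverted scheme the prediction path is the identity, so skipping dropout layers at inference, as keras_model.cc does, would be consistent with it.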

jvwilliams23 Dec 20 '22 09:12