
Scale parameter: unexpected behavior influences the uncertainty estimation

Open pasq-cat opened this issue 2 years ago • 0 comments

Hi, I was experimenting with TensorFlow Probability and made a simple multilayer perceptron with 3 hidden layers alternating with 3 dropout layers, and a final distributional layer that models a Gaussian distribution with trainable `loc` and `scale`.

```python
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

scale = 1e-8
fratto = 1

def normal_sp(params):
    # loc from the first output column, scale from a softplus of the second,
    # divided by the hyperparameter fratto
    return tfd.Normal(loc=params[:, 0:1],
                      scale=(scale + tf.math.softplus(params[:, 1:2])) / fratto)

# loss function: negative log-likelihood of y under the predicted distribution
def NLL(y, distr):
    return -distr.log_prob(y)
```

The particular structure of the model is fairly standard and not that important, but I just realized that the uncertainty estimate of the neural network depends heavily on the value of the hyperparameter `fratto`, even though there is no reason for it to be so fundamental. The network should adjust `params[:, 1:2]` accordingly whether I set `fratto` to 1, 2, or 3 and reach the same result; instead, that doesn't happen, and with `fratto = 1` I get a much larger standard deviation than the data suggests. With `fratto = 2` I get better results with a lower level of miscalibration. Is it normal that one needs to fine-tune the scale parameter?
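For context, here is a minimal sketch of the kind of model described above. It assumes a Keras `Sequential` MLP with a `tfp.layers.DistributionLambda` head; the layer widths, dropout rates, and optimizer are placeholders I chose for illustration, and `normal_sp` / `NLL` are the functions from the snippet above.

```python
import tensorflow as tf
import tensorflow_probability as tfp

def build_model(n_features):
    # 3 hidden Dense layers alternating with 3 Dropout layers,
    # then a 2-unit layer feeding [loc, raw scale] into normal_sp.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(2),                  # outputs [loc, pre-softplus scale]
        tfp.layers.DistributionLambda(normal_sp),  # final distributional layer
    ])
    model.compile(optimizer="adam", loss=NLL)
    return model
```

With this setup, changing `fratto` only rescales the softplus output, so in principle the network could compensate by shifting `params[:, 1:2]` during training and reach the same predictive scale; the observed dependence on `fratto` is what the question is about.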

pasq-cat, Oct 13 '23 04:10