
Loss function

Open · dkirkby opened this issue 6 years ago · 0 comments

I am having trouble understanding your loss function defined here as:

pre_output = self.layers[-1].lin_output
log_prob = -T.sum(T.nnet.softplus(-target * pre_output + (1 - target) * pre_output), axis=1)
loss = (-log_prob).mean()

It looks like the softplus argument simplifies to (1 - 2 * target) * pre_output. Does this form have better numerics, and why is softplus used here?
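
For concreteness, here is a small NumPy check of that simplification and of the numerics (my own sketch, not code from this repo; I write the stable softplus as np.logaddexp(0, x) = log(1 + exp(x)), which is what T.nnet.softplus computes):

import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)), elementwise.
    return np.logaddexp(0.0, x)

x = np.array([-800.0, -5.0, 0.0, 5.0, 800.0])  # pre_output (logits)
t = np.array([1.0, 0.0, 1.0, 1.0, 0.0])        # binary targets

# The softplus argument from the loss, and its simplified form:
arg = -t * x + (1 - t) * x
assert np.allclose(arg, (1 - 2 * t) * x)

# Naive cross entropy through a sigmoid: log(0) = -inf at the extreme logits.
sig = 1.0 / (1.0 + np.exp(-x))
naive = -(t * np.log(sig) + (1 - t) * np.log(1 - sig))
print(naive)          # [inf, 0.0067, 0.693, 0.0067, inf]

# The softplus form stays finite on the same inputs.
print(softplus(arg))  # [800, 0.0067, 0.693, 0.0067, 800]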

How does this loss relate to eqn (5) of your paper, which looks like a standard binary cross entropy?
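
If I expand eqn (5) myself using log sigmoid(x) = -softplus(-x) and log(1 - sigmoid(x)) = -softplus(x), the two forms do seem to agree exactly for binary targets (again my own sketch, not repo code):

import numpy as np

softplus = lambda x: np.logaddexp(0.0, x)

rng = np.random.default_rng(0)
x = rng.normal(scale=3.0, size=1000)             # pre-activations (logits)
t = rng.integers(0, 2, size=1000).astype(float)  # binary targets

# Eqn (5), per dimension: -[t log sigmoid(x) + (1 - t) log(1 - sigmoid(x))],
# rewritten with log sigmoid(x) = -softplus(-x) and log(1 - sigmoid(x)) = -softplus(x).
bce = t * softplus(-x) + (1 - t) * softplus(x)

# The single-softplus form from the repo; equal because t is 0 or 1.
repo = softplus(-t * x + (1 - t) * x)

assert np.allclose(bce, repo)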

dkirkby · Mar 23 '19 19:03