
A problem with loss computation.

Open yxdr opened this issue 6 years ago • 1 comments

loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)

The loss computed by the line above is the average over every time step, which can make the model difficult to train. So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this made the model easier to train.

yxdr avatar Dec 07 '19 05:12 yxdr

loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)

The loss computed by the line above is the average over every time step, which can make the model difficult to train. So I suggest accumulating (summing) the loss over the time steps instead. In my experiments, this made the model easier to train.

So, how should the loss be written?

fengxin619 avatar Jun 21 '21 07:06 fengxin619
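A minimal sketch of the suggested change, assuming `output` holds log-probabilities of shape `(trg_len, batch, vocab_size)` and `trg` holds target indices of shape `(trg_len, batch)` (the shapes and dummy data below are illustrative, not from the repo): switching `F.nll_loss` to `reduction='sum'` accumulates the per-token losses instead of averaging them.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes and dummy data (not from the repo).
vocab_size, pad = 10, 0
output = torch.log_softmax(torch.randn(5, 3, vocab_size), dim=-1)
trg = torch.randint(1, vocab_size, (5, 3))

# Original: averages the NLL over every non-pad target token.
mean_loss = F.nll_loss(output[1:].view(-1, vocab_size),
                       trg[1:].contiguous().view(-1),
                       ignore_index=pad)

# Suggested: accumulate (sum) the per-token losses instead.
sum_loss = F.nll_loss(output[1:].view(-1, vocab_size),
                      trg[1:].contiguous().view(-1),
                      ignore_index=pad,
                      reduction='sum')

# The two differ only by a factor of the non-pad token count,
# so summing effectively scales the gradient by sequence length.
n_tokens = (trg[1:] != pad).sum()
assert torch.allclose(sum_loss, mean_loss * n_tokens)
```

Note that the summed loss grows with batch size and sequence length, so the learning rate may need retuning; a common middle ground is to sum per sequence and average over the batch.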