yxdr

Results: 1 issue by yxdr

loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)

The loss computed by the line above is averaged over every time step, which can make the model difficult to train. So...
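To make the reduction concrete, here is a minimal pure-Python sketch of the arithmetic, assuming `F.nll_loss` is left at its default `reduction='mean'`, which averages over every target token that is not `ignore_index`. The function names, the `PAD` value, and the toy data are hypothetical. The second function shows one common alternative normalization (sum over tokens, divide by the batch size), which is not necessarily what the issue goes on to propose.

```python
import math

PAD = 0  # hypothetical padding index, standing in for `pad` in the issue


def token_mean_nll(log_probs, targets, ignore_index=PAD):
    """Mean negative log-likelihood over all non-ignored tokens.

    Mirrors the arithmetic of reduction='mean' combined with ignore_index.
    """
    losses = [-row[t] for row, t in zip(log_probs, targets) if t != ignore_index]
    return sum(losses) / len(losses)


def sequence_sum_nll(log_probs, targets, batch_size, ignore_index=PAD):
    """Negative log-likelihood summed over tokens, divided by batch size."""
    total = sum(-row[t] for row, t in zip(log_probs, targets) if t != ignore_index)
    return total / batch_size


# Toy data: 3 time steps x 2 sequences, flattened to 6 rows, vocabulary of 3.
# Predictions are uniform, so every scored token costs exactly log(3).
uniform_row = [math.log(1 / 3)] * 3
log_probs = [uniform_row] * 6
targets = [1, 2, 1, PAD, 2, 1]  # one padded position, five scored tokens

mean_loss = token_mean_nll(log_probs, targets)
sum_loss = sequence_sum_nll(log_probs, targets, batch_size=2)

print(mean_loss)  # log(3): independent of how many tokens are scored
print(sum_loss)   # 5 * log(3) / 2: grows with sequence length
```

In PyTorch itself, the second quantity could be obtained by passing `reduction='sum'` to `F.nll_loss` and dividing the result by the batch size.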