Results: 1 issue by yxdr
loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
The loss computed by the line above is averaged over every time step, which can make the model difficult to train. So...
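To make the averaging concrete, here is a minimal sketch of what that call computes. It assumes `output` holds log-probabilities of shape (seq_len, batch_size, vocab_size) and `trg` holds token ids of shape (seq_len, batch_size). The tensor sizes and the pad index are hypothetical, and the per-sequence normalisation at the end is only one common alternative, not necessarily the change this issue goes on to propose.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes and pad index, chosen only for illustration.
seq_len, batch_size, vocab_size, pad = 5, 3, 7, 0

# Assumed layout: log-probabilities, (seq_len, batch_size, vocab_size).
output = F.log_softmax(torch.randn(seq_len, batch_size, vocab_size), dim=-1)
# Target ids in 1..vocab_size-1, so none collides with pad by accident.
trg = torch.randint(1, vocab_size, (seq_len, batch_size))
trg[-1, 0] = pad  # mark one position as padding

log_probs = output[1:].view(-1, vocab_size)
targets = trg[1:].contiguous().view(-1)

# Default reduction='mean': average over all non-ignored tokens.
mean_loss = F.nll_loss(log_probs, targets, ignore_index=pad)

# reduction='sum': total loss over all non-ignored tokens.
sum_loss = F.nll_loss(log_probs, targets, ignore_index=pad, reduction='sum')

# The mean is the sum divided by the number of non-pad tokens.
num_tokens = (targets != pad).sum()

# Assumed alternative: normalise by batch size instead of token count,
# giving a per-sequence loss that is larger than the per-token mean.
per_sequence_loss = sum_loss / batch_size
```

With these shapes, `mean_loss` equals `sum_loss / num_tokens`, so the gradient of each token is scaled down by the total number of non-pad tokens in the batch; dividing by `batch_size` instead keeps the per-token gradient scale independent of sequence length.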