chenjc

Results: 6 comments by chenjc

Have you been able to reproduce the Transformer results reported in the paper?

Hello, I have received your email. Please be informed.

Hello, I have received your email. Please be informed.

@xingyuma618 Also, do you understand why the authors use torch.sum(nll, dim=1) to compute the log-likelihood rather than torch.mean(nll, dim=1)? I think this differs from the calculation described in the paper.
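To make the question concrete, here is a minimal sketch of how the two reductions differ. The tensor values and the variable names `seq_nll_sum` and `seq_nll_mean` are made up for illustration and are not taken from the repository; it assumes `nll` holds per-token negative log-likelihoods with shape (batch, seq_len).

```python
import torch

# Hypothetical per-token negative log-likelihoods, shape (batch, seq_len).
nll = torch.tensor([[0.5, 1.0, 1.5, 2.0],
                    [1.0, 1.0, 1.0, 1.0]])

# Summing over dim=1 gives the total NLL of each sequence.
seq_nll_sum = torch.sum(nll, dim=1)

# Averaging over dim=1 gives the per-token NLL of each sequence.
seq_nll_mean = torch.mean(nll, dim=1)

# With no padding or masking, the two differ only by the factor seq_len.
seq_len = nll.shape[1]
assert torch.allclose(seq_nll_sum, seq_nll_mean * seq_len)

print(seq_nll_sum)
print(seq_nll_mean)
```

If every sequence has the same length, the two scores rank sequences identically. They can disagree once sequences have different lengths, because the sum grows with length and the mean does not.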