Issues of Gryff1ndor (1 result)
In your formula (the image below), it seems that log[π(y|x)] was calculated by applying .sum(-1) after logits.softmax(-1), then .log(). But in your code (the image below), log[π(y|x)] was...
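
To make the question concrete, here is a minimal sketch in plain Python of why the order of softmax, log, and sum matters. It is not taken from the repository under discussion: the function names, the toy logits, and the token ids are all hypothetical. It assumes the common convention in which log[π(y|x)] is the sum over positions of the log-softmax value at each target token, which may or may not be what the formula or the code in the issue intends.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def softmax(logits):
    """Softmax computed from the stable log-softmax."""
    return [math.exp(v) for v in log_softmax(logits)]

def sequence_log_prob(logits_per_step, token_ids):
    """Sum over positions t of log softmax(logits_t)[y_t]."""
    return sum(
        log_softmax(step_logits)[tok]
        for step_logits, tok in zip(logits_per_step, token_ids)
    )

# Hypothetical toy data: 2 positions, vocabulary of 3 tokens.
logits_per_step = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
token_ids = [0, 2]

# Order A (assumed convention): softmax, log, pick the target token,
# then sum over sequence positions.
seq_logp = sequence_log_prob(logits_per_step, token_ids)

# Order B: softmax, sum over the vocabulary axis, then log.
# Each softmax row sums to 1, so this is log(1) = 0 at every position.
sum_then_log = [math.log(sum(softmax(s))) for s in logits_per_step]

print("order A:", seq_logp)
print("order B:", sum_then_log)
```

Under these assumptions, order A gives a negative sequence log-probability, while order B collapses to approximately zero regardless of the logits, so the two readings of the formula are not interchangeable.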