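Subtracting max_y from every y before exponentiating makes softmax far less likely to overflow: the largest exponent becomes 0, so every exp term is at most 1. The result is mathematically unchanged, because the common factor exp(-max_y) cancels between the numerator and the denominator.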
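A minimal sketch of the idea in plain Python, to show the difference on large inputs. The function names softmax_naive and softmax_stable are hypothetical and chosen only for illustration; y and max_y follow the naming above.

```python
import math

def softmax_naive(y):
    # Exponentiates the raw values; math.exp raises OverflowError
    # once an input is large enough (roughly above 709 for doubles).
    exps = [math.exp(v) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(y):
    # Shift by the maximum so every exponent is <= 0 and exp(...) <= 1.
    max_y = max(y)
    exps = [math.exp(v - max_y) for v in y]
    total = sum(exps)  # at least 1.0, since the max element contributes exp(0)
    return [e / total for e in exps]

if __name__ == "__main__":
    big = [1000.0, 1001.0, 1002.0]
    try:
        softmax_naive(big)
    except OverflowError as err:
        print("naive version failed:", err)
    print("stable version:", softmax_stable(big))
```

On inputs small enough for both to run, the two functions agree up to rounding, which is a quick way to check that the shift does not change the result.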