DualStudent
Loss value becomes NaN!
Thanks for your marvelous work! I am trying to use your method as my baseline, but if I set the number of epochs larger than 300 (the value set in your original script), the loss becomes NaN after epoch 200. I can't figure out what's wrong. Do you have any idea what might cause this? Thanks!
Hi, thanks for your attention. This sounds strange. I have tried larger epoch counts (like 600 or 1200) before. Can you check the learning rate after 200 epochs?
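One way to run that check is to log the loss and learning rate once per epoch and then scan the logs. The snippet below is only a rough sketch in plain Python, not code from this repository: the helper names `find_first_bad_epoch` and `suspicious_learning_rates` are hypothetical, and it assumes you have collected one loss value and one learning-rate value per epoch (in PyTorch, the current rate can usually be read from `optimizer.param_groups[0]["lr"]`).

```python
import math


def find_first_bad_epoch(losses, learning_rates):
    """Return (epoch, loss, lr) for the first non-finite loss, or None.

    Hypothetical helper: `losses` and `learning_rates` are per-epoch logs
    that you record yourself during training.
    """
    for epoch, (loss, lr) in enumerate(zip(losses, learning_rates)):
        if not math.isfinite(loss):  # catches both NaN and +/-inf
            return epoch, loss, lr
    return None


def suspicious_learning_rates(learning_rates):
    """Return the epochs whose learning rate is negative or non-finite."""
    return [
        epoch
        for epoch, lr in enumerate(learning_rates)
        if not math.isfinite(lr) or lr < 0.0
    ]


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not results from the repo.
    losses = [0.9, 0.5, float("nan"), 0.4]
    lrs = [0.1, 0.05, -0.01, 0.0]
    print(find_first_bad_epoch(losses, lrs))
    print(suspicious_learning_rates(lrs))
```

If the first NaN loss lines up with an epoch where the learning rate looks wrong, the schedule is a likely place to look; if the learning rate looks normal there, the cause is probably elsewhere.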