AdaBound
When does the optimizer switch to SGD?
I set the initial lr=0.0001 and final_lr=0.1, but I still don't know when the optimizer becomes SGD. Do I need to raise my learning rate to the final learning rate manually? Thanks!
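There is no hard switch. AdaBound clips each per-parameter step size between a lower and an upper bound, and both bounds converge to final_lr as training proceeds, so the behavior moves smoothly from Adam-like to SGD-like. You do not need to raise the learning rate manually.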
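To get a feel for how fast the transition happens, here is a rough sketch that prints the clipping bounds at a few step counts. It assumes the bound functions from the AdaBound paper, lower = final_lr * (1 - 1/(gamma*t + 1)) and upper = final_lr * (1 + 1/(gamma*t)), and gamma=1e-3; the helper name `adabound_bounds` is made up for illustration and is not part of the library. Check the optimizer source for the exact formula and defaults in your version.

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Return the (lower, upper) clipping bounds on the step size at `step`.

    Assumed form, taken from the AdaBound paper; `step` starts at 1.
    """
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return lower, upper


# Early on the bounds are very loose (Adam-like); later they squeeze
# toward final_lr, which makes every parameter's step size nearly equal
# (SGD-like).
for step in (1, 100, 1_000, 10_000, 100_000):
    lower, upper = adabound_bounds(step)
    print(f"step {step:>7}: lower={lower:.5f}  upper={upper:.5f}")
```

Under these assumptions, a smaller gamma keeps the optimizer Adam-like for longer, and a larger gamma tightens the bounds sooner.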