AdaBound
An optimizer that trains as fast as Adam and as good as SGD.
I set the initial lr=0.0001 and final_lr=0.1, but I still don't know when the optimizer becomes SGD. Do I need to increase my learning rate to the final learning rate...
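For reference, a minimal sketch of how this is usually configured, assuming the PyPI `adabound` package: the bounds tighten gradually with the step count (controlled by `gamma`), so there is no single epoch at which the optimizer "becomes" SGD.

```python
import torch.nn as nn
import adabound  # assumes the PyPI `adabound` package is installed

model = nn.Linear(10, 1)

# lr is the initial (Adam-like) step size; final_lr is the SGD-like step size
# that the lower and upper bounds converge to as training proceeds.
optimizer = adabound.AdaBound(
    model.parameters(),
    lr=1e-4,        # initial learning rate
    final_lr=0.1,   # learning rate the bounds converge to
    gamma=1e-3,     # (assumed default) speed at which the bounds tighten
)
```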
```
/home/xxxx/.local/lib/python3.7/site-packages/adabound/adabound.py:94: UserWarning: This overload of add_ is deprecated:
    add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
    add_(Tensor other, *, Number alpha) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
...
```
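The warning itself points at the fix: switch the call to the keyword form of `add_`. A minimal sketch of the two call styles, assuming the flagged line is an Adam-style moment update:

```python
import torch

beta1 = 0.9
grad = torch.randn(4)
exp_avg = torch.zeros(4)

# Deprecated overload: add_(Number alpha, Tensor other)
# exp_avg.mul_(beta1).add_(1 - beta1, grad)

# Signature suggested by the warning: add_(Tensor other, *, Number alpha)
exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
```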
Hi, thanks a lot for sharing your excellent work. I wonder, if I want to change the learning rate as epochs increase, how do I set the parameters **lr** and **final_lr** in...
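One way to do this, sketched below, is to attach a standard PyTorch scheduler to the optimizer and let it decay `lr`; how `final_lr` should track the decayed `lr` depends on the adabound implementation, so treat this as an assumption to verify.

```python
import torch
import torch.nn as nn
import adabound  # assumes the PyPI `adabound` package

model = nn.Linear(10, 1)
optimizer = adabound.AdaBound(model.parameters(), lr=1e-3, final_lr=0.1)

# AdaBound exposes param_groups like any torch.optim optimizer, so standard
# schedulers attach to it directly; here lr decays by 10x every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

x, y = torch.randn(32, 10), torch.randn(32, 1)
for epoch in range(90):
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # decays lr once per epoch
```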
https://github.com/wayne391/Image-Super-Resolution/blob/master/src/models/RCAN.py Just change `optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, amsgrad=False)` to `optimizer = adabound.AdaBound(model.parameters(), lr=1e-4, final_lr=0.1)`. The loss becomes NaN in the RCAN model, but Adam works fine.
Hello, can you please tell me what the two quantities in α / √V_t mean, especially V_t? Thank you
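For context, in Adam-style updates α is the base step size and V_t is the exponential moving average of the squared gradients; AdaBound clips the resulting per-parameter step size α / √V_t between a lower and an upper bound that both converge toward `final_lr`. A minimal numerical sketch of that clipping, with made-up bound values:

```python
import torch

alpha = 1e-3                     # base step size (α)
grad = torch.randn(4)
v_t = (1 - 0.999) * grad ** 2    # first step of the EMA of squared gradients (V_t)

# Adam-style per-parameter step size: α / √V_t
step_size = alpha / (v_t.sqrt() + 1e-8)

# AdaBound additionally clips this step size into [lower_bound, upper_bound];
# both bounds converge toward final_lr during training (values here are made up).
lower_bound, upper_bound = 0.05, 0.2
print(step_size.clamp(lower_bound, upper_bound))
```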
The provided new optimizer is sensitive to tiny batch sizes (
Greetings, thanks for your great paper. I am wondering about the hyperparameters you used for the language modeling experiments. Could you provide information about them? Thank you!
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer https://github.com/mgrankin/over9000/issues/4 I strongly believe that AdaBound would be better if it used RAdam instead of Adam. It could also be merged with Lookahead and LAMB. Then we would have the...
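As a rough illustration of the kind of combination being proposed, here is a from-scratch Lookahead-style wrapper around `torch.optim.RAdam` (available in recent PyTorch releases); this is a minimal sketch, not the Ranger implementation linked above.

```python
import torch
import torch.nn as nn


class Lookahead:
    """Minimal Lookahead sketch: keep slow weights and pull them toward the
    fast weights of the inner optimizer every k steps."""

    def __init__(self, inner, k=5, alpha=0.5):
        self.inner, self.k, self.alpha, self.step_count = inner, k, alpha, 0
        # Snapshot slow copies of every parameter the inner optimizer manages.
        self.slow = [
            [p.detach().clone() for p in group["params"]]
            for group in inner.param_groups
        ]

    def zero_grad(self):
        self.inner.zero_grad()

    def step(self):
        self.inner.step()
        self.step_count += 1
        if self.step_count % self.k == 0:
            for group, slow_group in zip(self.inner.param_groups, self.slow):
                for p, slow in zip(group["params"], slow_group):
                    # slow <- slow + alpha * (fast - slow), then sync fast to slow.
                    slow.add_(p.detach() - slow, alpha=self.alpha)
                    p.data.copy_(slow)


model = nn.Linear(10, 1)
optimizer = Lookahead(torch.optim.RAdam(model.parameters(), lr=1e-3))

x, y = torch.randn(32, 10), torch.randn(32, 1)
for _ in range(100):
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```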
The correct grammar would be "as well as SGD" rather than "as good as SGD"; not sure if you care.
I tested three methods on a very simple problem and got the results shown above. The code is printed here: `import torch` `import torch.nn as nn` `import matplotlib.pyplot as plt`...
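Since the original code is truncated above, here is a self-contained sketch of that kind of comparison, assuming the three methods are SGD, Adam, and AdaBound fitting a tiny regression problem:

```python
import torch
import torch.nn as nn
import adabound  # assumes the PyPI `adabound` package

torch.manual_seed(0)
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = 3 * x + 0.5 + 0.1 * torch.randn_like(x)   # noisy linear target

def run(make_optimizer, steps=500):
    """Train a fresh linear model with the given optimizer and return final loss."""
    model = nn.Linear(1, 1)
    opt = make_optimizer(model.parameters())
    for _ in range(steps):
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()

results = {
    "SGD": run(lambda p: torch.optim.SGD(p, lr=0.1)),
    "Adam": run(lambda p: torch.optim.Adam(p, lr=1e-2)),
    "AdaBound": run(lambda p: adabound.AdaBound(p, lr=1e-2, final_lr=0.1)),
}
print(results)  # final training loss per optimizer
```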