CliqueNet

Is Nesterov momentum used for ImageNet?

Open ZhuMai opened this issue 7 years ago • 0 comments

There is a sentence in your paper:

We train our models using stochastic gradient descent (SGD) with 0.9 Nesterov momentum and 10^-4 weight decay.

However, at line 77 of train_imagenet.py, nesterov=True is not passed to torch.optim.SGD(), so the optimizer falls back to its default of classical (non-Nesterov) momentum. Was Nesterov momentum actually used for the ImageNet models?
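To make the difference concrete, here is a minimal pure-Python sketch of the update rule as I understand it from the torch.optim.SGD documentation (with dampening=0), with and without the nesterov flag. The helper names sgd_step and run, the toy objective, and the learning rate are all hypothetical and only for illustration; they are not taken from train_imagenet.py.

```python
def sgd_step(param, grad, buf, lr, momentum=0.9, weight_decay=1e-4, nesterov=False):
    """One scalar SGD step, following the rule documented for torch.optim.SGD
    with dampening=0. `buf` is the momentum buffer (None before the first step)."""
    g = grad + weight_decay * param                  # L2 weight decay folded into the gradient
    buf = g if buf is None else momentum * buf + g   # buffer starts as the first gradient
    step = g + momentum * buf if nesterov else buf   # Nesterov adds a look-ahead term
    return param - lr * step, buf


def run(nesterov, steps=3, lr=0.1):
    # Hypothetical toy objective f(w) = 0.5 * w**2, so the gradient is simply w.
    w, buf = 1.0, None
    for _ in range(steps):
        w, buf = sgd_step(w, w, buf, lr, nesterov=nesterov)
    return w


plain = run(nesterov=False)  # what the script appears to do today
nag = run(nesterov=True)     # what the paper describes
print(plain, nag)            # the two trajectories differ
```

If the paper's setting is the intended one, I would expect the fix to be adding nesterov=True to the existing torch.optim.SGD(...) call alongside momentum=0.9 and weight_decay=1e-4, but I may be missing something elsewhere in the script.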

ZhuMai, Jun 29 '18 02:06