
Question about clip_grad_norm in traincal_classifier

Open FancccyRay opened this issue 6 years ago • 1 comment

Hello Prof. Liao. In your paper I read: "The training procedure is quite unstable because the batch size is only 2 per GPU, and there are many outliers in the training set. Gradient clipping is therefore used in a later stage of training, i.e. if the norm of the gradient vector is larger than one, it would be normalized to one." My understanding is that gradient clipping is applied in a later stage of training, i.e. if the norm of the gradient vector exceeds 1, it is rescaled to norm 1.

However, the only occurrence of clip_grad_norm(model.parameters(), 1) I can find is in traincal_classifier, and that line is commented out. Are we supposed to uncomment it ourselves once training reaches the later stage (and, if so, at what point does the "later stage" begin?), or was the line commented out by mistake?
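For reference, a minimal sketch of what the commented-out call does: it computes the global L2 norm over all parameter gradients and, if that norm exceeds the threshold, rescales every gradient so the global norm equals the threshold. This is the documented behaviour of PyTorch's `torch.nn.utils.clip_grad_norm_` (the underscore-free `clip_grad_norm` used in this repo is the older, since-deprecated name); the NumPy version below is only an illustration of the math, not the repo's code.

```python
import numpy as np

def clip_grad_norm(grads, max_norm=1.0):
    """Illustrative re-implementation of gradient-norm clipping.

    grads    -- list of gradient arrays (one per parameter tensor)
    max_norm -- clipping threshold (the paper uses 1)
    Returns the (possibly rescaled) gradients and the pre-clip norm.
    """
    # Global L2 norm across all parameter gradients, not per-tensor.
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total_norm > max_norm:
        # Rescale every gradient by the same factor so the
        # global norm becomes exactly max_norm.
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

# Example: a gradient with global norm 5 gets scaled down to norm 1.
grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]
clipped, norm_before = clip_grad_norm(grads, max_norm=1.0)
print(norm_before)  # 5.0
```

In training code, the call would sit between `loss.backward()` and `optimizer.step()`; gating it on the epoch (e.g. `if epoch >= some_threshold:`) is one plausible reading of "later stage of training", but the repo does not specify the threshold.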

FancccyRay · Sep 25 '19 13:09

@FancccyRay @lfz Have you solved this problem? I noticed a related answer in #21. When should we use clip_grad_norm?

lmz123321 · Sep 24 '20 01:09