zhhao1

Results 5 comments of zhhao1

Hi, has this problem been solved?

I found a solution. Running on a single GPU, I printed next(self.parameters()).dtype and it was always torch.float32, so this is probably a version issue. Replacing it directly is enough.

My experience:

- model.half() + adam(eps=1e-8): loss is nan
- model.half() + sgd: loss is normal, but training does not converge
- model.half() + adam(eps=1e-4): loss is normal, but training does not converge
- model.half() + fp16: loss is normal, but training does not converge
- model + adam(eps=1e-8): loss is normal, training converges

Remove .half() can...
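The comment does not say why the default eps gives nan under model.half(). One plausible explanation, stated here as an assumption rather than something the comment confirms, is that 1e-8 is below the smallest positive value float16 can represent, so eps becomes zero and the Adam denominator can become zero as well. The sketch below checks only the rounding step, using the standard library's half-precision format; `to_fp16` is a hypothetical helper, not part of any library.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Assumption: eps ends up stored or applied in float16 once the model is halved.
eps_default = to_fp16(1e-8)  # smaller than the smallest fp16 subnormal (about 6e-8)
eps_larger = to_fp16(1e-4)   # within the fp16 range, so it stays nonzero

print("eps=1e-8 in fp16:", eps_default)
print("eps=1e-4 in fp16 is nonzero:", eps_larger > 0.0)
```

This would be consistent with the list above, where raising eps to 1e-4 removes the nan but does not by itself make training converge.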

> > My experience:
> >
> > - model.half() + adam(eps=1e-8): loss is nan
> > - model.half() + sgd: loss is normal, but training does not converge
> > - model.half() + adam(eps=1e-4): loss is normal, but training does not converge
> > - model.half() + fp16: loss is normal, but training does not converge
> > - model + adam(eps=1e-8): loss is normal, training converges
> >
> > Remove...

> The first release of Distil-Whisper will be for English. We'll be releasing training code next week to facilitate anyone in the community to distill Whisper on their choice of...