Michael Carilli comments

Results: 26 comments of Michael Carilli

ImportError: cannot import name 'amp'

There may still be some python import weirdness going on. Try moving apex out of your training directory hierarchy, to a completely different location.

cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

This might be a cudnn issue, especially if you're using cudnn 7.2. Try

```
>>> import torch
>>> torch.backends.cudnn.version()
```

Upgrading your cudnn version may fix it: https://github.com/NVIDIA/apex/issues/78#issuecomment-440301134

Container options...

cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

> I tried updating and unfortunately the error persists. The command you mentioned outputs 7401.

@pancho111203 Since you've got cuda 10 on bare metal (meaning your system has the cuda...

torch.cuda.amp > apex.amp

Yes. `torch.cuda.amp.autocast` can be enabled wherever you want and affects only ops invoked within enabled regions. `autocast` and `torch.cuda.amp.GradScaler` are modular codewise. During training, you should use both (`autocast` selects...
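For illustration, here is a minimal sketch of using `autocast` and `GradScaler` together in a training loop. The function name `train_steps`, the toy `nn.Linear` model, and the random data are hypothetical, and the `enabled=` fallback for machines without CUDA is an assumption added so the sketch runs anywhere; it is not taken from the comment above.

```python
import torch
import torch.nn as nn


def train_steps(num_steps=3):
    """Run a few mixed-precision training steps on a toy regression problem."""
    use_amp = torch.cuda.is_available()  # assumption: fall back to plain fp32 on CPU
    device = "cuda" if use_amp else "cpu"

    torch.manual_seed(0)
    model = nn.Linear(8, 1).to(device)  # model is created in default precision (fp32)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

    x = torch.randn(16, 8, device=device)
    y = torch.randn(16, 1, device=device)

    losses = []
    for _ in range(num_steps):
        optimizer.zero_grad()
        # Only the forward pass and loss computation run under autocast.
        with torch.cuda.amp.autocast(enabled=use_amp):
            loss = nn.functional.mse_loss(model(x), y)
        # The scaler scales the loss before backward, then unscales the
        # gradients and skips the optimizer step if they contain inf/NaN.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        losses.append(loss.item())
    return model, losses
```

With `enabled=False` both objects become pass-throughs, so the same loop works with or without mixed precision.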

torch.cuda.amp > apex.amp

@Damiox `torch.cuda.amp.autocast` is similar to O1 in that it casts function inputs on the fly without touching model weights. However, unlike apex O1, `autocast` only causes casting behavior in regions...
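A small sketch of that region-scoped behavior, assuming a CUDA device is available (on a CPU-only machine the sketch disables autocast and everything stays fp32). The helper name `autocast_dtypes` and the toy layer are hypothetical.

```python
import torch


def autocast_dtypes():
    """Return output dtypes inside and outside an autocast region, plus the weight dtype."""
    use_cuda = torch.cuda.is_available()  # assumption: autocast is disabled without CUDA
    device = "cuda" if use_cuda else "cpu"

    layer = torch.nn.Linear(4, 4).to(device)
    x = torch.randn(2, 4, device=device)

    with torch.cuda.amp.autocast(enabled=use_cuda):
        inside = layer(x).dtype   # inputs are cast on the fly within the region
    outside = layer(x).dtype      # ops outside the region run in default precision
    weight = layer.weight.dtype   # the stored weights are never modified
    return inside, outside, weight
```

On a CUDA machine one would expect the output inside the region to be half precision, while the output outside the region and the weights remain `torch.float32`.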

torch.cuda.amp > apex.amp

@vince62s `apex.optimizers.FusedAdam` and `torch.optim.Adam` should both work out of the box with native Amp following the [documented control flow](https://pytorch.org/docs/master/notes/amp_examples.html#typical-mixed-precision-training) (create model in default precision aka fp32). If you also need...