How to use SAM with torch.cuda.amp.GradScaler
How can you combine SAM with GradScaler and gradient clipping, given that the scaler's gradients can't be unscaled twice?
I am also struggling with this right now. I have tried:
- Defining the closure function globally and calling scaler.step(optimizer) -> failed
- Passing the closure function as a kwarg to scaler.step -> failed
- Calling .unscale_ twice -> failed (GradScaler raises a RuntimeError if unscale_ is called twice for the same optimizer in one step)
If there is a workaround, it would be very useful to share it with us.
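One possible workaround is to route only the first backward pass through `scaler.unscale_` and to undo the scaling by hand (dividing the gradients by `scaler.get_scale()`) for the second pass, so `unscale_` is never called twice. Below is a hedged sketch of that idea, not an official recipe: the `sam_amp_step` helper is my own name, the manual inf-check replaces the bookkeeping `scaler.step` would normally do, and a minimal SAM class (with `first_step`/`second_step` in the style of this repo) is inlined only to make the example self-contained.

```python
# Sketch: SAM + torch.cuda.amp.GradScaler + gradient clipping without
# calling scaler.unscale_ twice. Assumptions are marked in comments.
import torch

class SAM(torch.optim.Optimizer):
    """Minimal SAM wrapper (simplified from this repo, for illustration only)."""
    def __init__(self, params, base_optimizer, rho=0.05, **kwargs):
        defaults = dict(rho=rho, **kwargs)
        super().__init__(params, defaults)
        self.base_optimizer = base_optimizer(self.param_groups, **kwargs)
        self.param_groups = self.base_optimizer.param_groups

    @torch.no_grad()
    def first_step(self, zero_grad=False):
        grad_norm = torch.norm(torch.stack([
            p.grad.norm(p=2)
            for group in self.param_groups for p in group["params"]
            if p.grad is not None]), p=2)
        for group in self.param_groups:
            scale = group["rho"] / (grad_norm + 1e-12)
            for p in group["params"]:
                if p.grad is None:
                    continue
                e_w = p.grad * scale
                p.add_(e_w)                      # climb to w + e(w)
                self.state[p]["e_w"] = e_w
        if zero_grad:
            self.zero_grad()

    @torch.no_grad()
    def second_step(self, zero_grad=False):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                p.sub_(self.state[p]["e_w"])     # return to w
        self.base_optimizer.step()
        if zero_grad:
            self.zero_grad()

def sam_amp_step(model, criterion, optimizer, scaler, x, y, max_norm=1.0):
    amp_on = torch.cuda.is_available()
    # First forward/backward: the one permitted unscale_, then clip and ascend.
    with torch.cuda.amp.autocast(enabled=amp_on):
        loss = criterion(model(x), y)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.first_step(zero_grad=True)

    # Second forward/backward at the perturbed weights: unscale by hand
    # instead of calling scaler.unscale_ a second time.
    with torch.cuda.amp.autocast(enabled=amp_on):
        loss2 = criterion(model(x), y)
    scaler.scale(loss2).backward()
    inv_scale = 1.0 / scaler.get_scale()
    grads_finite = True
    for p in model.parameters():
        if p.grad is not None:
            p.grad.mul_(inv_scale)               # hand-rolled unscale
            grads_finite &= bool(torch.isfinite(p.grad).all())
    if grads_finite:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
        optimizer.second_step(zero_grad=True)
    else:
        # Overflow: skip the descent step, as scaler.step would have done.
        optimizer.zero_grad(set_to_none=True)
    scaler.update()  # update() resets per-optimizer state for the next iteration
    return loss.detach()
```

Because `scaler.step` is bypassed for the second step, the overflow check that it normally performs is reproduced manually with `torch.isfinite`; `scaler.update()` still adjusts the scale afterwards. The sketch also degrades gracefully on CPU (the scaler and autocast become no-ops when CUDA is unavailable).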
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.