
GPU memory leak in adahessian optimizer?

Open sjscotti opened this issue 4 years ago • 3 comments

Hi! I am using your library and appreciate all the work you have put into this capability. After I started using the AdaHessian optimizer, I found that GPU memory usage kept growing as the optimizer ran, until it exhausted my GPU memory and the run crashed. The leak appears to be within the get_trace routine, and I believe it can be fixed by changing

    hvs = torch.autograd.grad(
        grads, params, grad_outputs=v, only_inputs=True, retain_graph=True
    )

to

    hvs = torch.autograd.grad(
        grads, params, grad_outputs=v, only_inputs=True, retain_graph=False
    )

If you get a chance to check this out, please comment to let me know. Thanks!
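For context, here is a minimal sketch (not the library's exact code) of the Hutchinson-style Hessian trace estimate that AdaHessian's get_trace computes. With retain_graph=True the backward graph survives each call, so intermediate buffers can accumulate on the GPU across steps; retain_graph=False lets autograd free the graph once the Hessian-vector product is done. The function name hutchinson_trace and the toy quadratic loss are illustrative assumptions, not the library's API.

```python
import torch

def hutchinson_trace(loss, params):
    # First-order grads with create_graph=True so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Rademacher probe vectors (entries +1 or -1), as AdaHessian uses.
    v = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
    # Hessian-vector products; retain_graph=False releases the graph here.
    hvs = torch.autograd.grad(
        grads, params, grad_outputs=v, only_inputs=True, retain_graph=False
    )
    # Elementwise v * (H v) estimates the Hessian diagonal.
    return [h * u for h, u in zip(hvs, v)]

# Toy check: loss = sum(3 * x^2) has a constant Hessian diagonal of 6,
# and v * (H v) = 6 * v^2 = 6 exactly, since v is +/-1.
x = torch.ones(4, requires_grad=True)
loss = (3.0 * x * x).sum()
trace = hutchinson_trace(loss, [x])
```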

sjscotti avatar Sep 08 '21 17:09 sjscotti

@sjscotti Would you like to submit a PR with the proposed fix? It could count toward Hacktoberfest.

jettify avatar Oct 02 '21 15:10 jettify

Thanks for the suggestion, but I don't have experience submitting pull requests. I gave it a try but got stuck at the first step (comparing branches).
BTW, you might also add a @torch.no_grad() decorator to each routine in adahessian. I have seen that done in some other implementations of AdaHessian (and there may be other optimizers in your library that could also use this decorator).
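To illustrate the suggestion, here is a sketch of the decorator on a hypothetical toy optimizer (ToySGD is made up for this example, not part of the library): @torch.no_grad() keeps the in-place parameter updates in step() out of autograd's tracking, which is both required for in-place updates on leaf tensors and avoids holding graph state.

```python
import torch

class ToySGD(torch.optim.Optimizer):
    """Minimal SGD-like optimizer, for illustration only."""

    def __init__(self, params, lr=0.1):
        super().__init__(params, dict(lr=lr))

    @torch.no_grad()  # parameter updates below are not recorded by autograd
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    p.add_(p.grad, alpha=-group["lr"])

# Usage: grad of sum(x^2) at x=1 is 2, so x <- 1 - 0.5 * 2 = 0.
x = torch.ones(2, requires_grad=True)
opt = ToySGD([x], lr=0.5)
(x * x).sum().backward()
opt.step()
```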

sjscotti avatar Oct 02 '21 16:10 sjscotti

Yep, I plan to add @torch.no_grad(); hopefully I will find time to do this soon.

jettify avatar Oct 08 '21 13:10 jettify