optax icon indicating copy to clipboard operation
optax copied to clipboard

feat: add `scale_by_gradient_norm`

Open SauravMaheshkar opened this issue 1 year ago • 3 comments

Reference: #594

This PR aims to add a new GradientTransformation which allows to scale by the gradient norm.

Request for Review: @fabianp

SauravMaheshkar avatar May 02 '24 11:05 SauravMaheshkar

CI is green again 😄

SauravMaheshkar avatar May 02 '24 21:05 SauravMaheshkar

Gentle ping: @fabianp

SauravMaheshkar avatar May 07 '24 16:05 SauravMaheshkar

thanks for the ping, will look into this ASAP

fabianp avatar May 16 '24 14:05 fabianp

Hi @SauravMaheshkar ,

a couple more comments after discussing this issue with the dev team:

  • let's promote this to the main package, since normalized GD is a fairly classical method :-)
  • we found the name scale_by_gradient_norm a bit misleading, as it's really scaled by the inverse of the gradient norm. A better name for the method would (IMO) be normalize_by_update_norm.

what do you think about these comments? do they sound reasonable?

fabianp avatar May 31 '24 15:05 fabianp

Hi @SauravMaheshkar ,

a couple more comments after discussing this issue with the dev team:

  • let's promote this to the main package, since normalized GD is a fairly classical method :-)
  • we found the name scale_by_gradient_norm a bit misleading, as it's really scaled by the inverse of the gradient norm. A better name for the method would (IMO) be normalize_by_update_norm.

what do you think about these comments? do they sound reasonable?

Hey @fabianp glad to know there's interest. That sounds reasonable.

Let me do some refactoring. Since it's going into main, I'll put some more effort in the docstrings as well. Will push soon 😄

SauravMaheshkar avatar May 31 '24 17:05 SauravMaheshkar

amazing, thanks!

fabianp avatar Jun 01 '24 11:06 fabianp

@fabianp I've refactored my branch to add the transform to main rather than contrib. Request for Review 😄

SauravMaheshkar avatar Jun 25 '24 11:06 SauravMaheshkar

I've added a doctest 👍🏽 @fabianp

SauravMaheshkar avatar Jun 28 '24 08:06 SauravMaheshkar

looks good to me, but I forgot to ask you one (hopefully) last thing:

please add this new function to the API documentation: https://github.com/google-deepmind/optax/blob/main/docs/api/transformations.rst

thanks! 🙏🏼

fabianp avatar Jun 28 '24 09:06 fabianp

looks good to me, but I forgot to ask you one (hopefully) last thing:

please add this new function to the API documentation: https://github.com/google-deepmind/optax/blob/main/docs/api/transformations.rst

thanks! 🙏🏼

Added in 5c01f6f

SauravMaheshkar avatar Jun 28 '24 09:06 SauravMaheshkar

thanks @SauravMaheshkar !

fabianp avatar Jun 28 '24 12:06 fabianp

duplicated https://github.com/google-deepmind/optax/pull/1000 since I'm having some issues getting this merged due to internal errors

fabianp avatar Jul 08 '24 13:07 fabianp

duplicated #1000 since I'm having some issues getting this merged due to internal errors

Typically copybara I assume 😅

SauravMaheshkar avatar Jul 08 '24 13:07 SauravMaheshkar

yup. Finally merged 🎉

fabianp avatar Jul 09 '24 09:07 fabianp

yup. Finally merged 🎉

Thanks a ton. Feels good to have it merged. Now on to #652

SauravMaheshkar avatar Jul 09 '24 10:07 SauravMaheshkar