Adan Optimizer

Open joaogui1 opened this issue 3 years ago • 8 comments

Is there any interest in adding the Adan optimizer to optax? If so I can do it

joaogui1 avatar Aug 24 '22 12:08 joaogui1

Interesting, I think that would be great! Thanks a lot!

Let us know if you'd like to discuss anything about the implementation as you write it. How much of the Adam code do you think can be reused?

mkunesch avatar Aug 29 '22 00:08 mkunesch

This looks very interesting, it would be an amazing contribution!

mtthss avatar Aug 30 '22 13:08 mtthss

I see there is an issue with the reproducibility of the pull request. There appears to be another JAX implementation for optax here, which might be worth looking at.

https://github.com/hr0nix/optax-adan

adam-hartshorne avatar Sep 12 '22 11:09 adam-hartshorne

Thanks for the pointer @adam-hartshorne! Sadly, when I test that implementation in my Colab, its error is three orders of magnitude larger than mine. I'm not sure where the gap comes from; the only difference I can see is the epsilon placement.

joaogui1 avatar Sep 12 '22 14:09 joaogui1

Hi, author of optax-adan here. Could you share a Colab where you've compared both implementations? I'd like to figure out where the difference comes from.

hr0nix avatar Sep 18 '22 15:09 hr0nix

Oh, found the link in the pull request, so no worries.

hr0nix avatar Sep 18 '22 15:09 hr0nix

Yep, the difference between implementations comes from epsilon placement. If I move it outside sqrt, the results are equal.
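
To illustrate why the placement matters, here is a minimal scalar sketch. It is not the code from either implementation: the function names, the default `eps`, and the sample values are assumptions chosen for illustration, and the real Adan update has more terms than this single division. It only compares dividing by `sqrt(n) + eps` against dividing by `sqrt(n + eps)`.

```python
import math


def direction_eps_outside(m, n, eps=1e-8):
    """Hypothetical helper: epsilon added after the square root."""
    return m / (math.sqrt(n) + eps)


def direction_eps_inside(m, n, eps=1e-8):
    """Hypothetical helper: epsilon added under the square root."""
    return m / math.sqrt(n + eps)


# Small second-moment estimate (assumed value): eps dominates n under the sqrt,
# so the two variants give very different step sizes.
small_outside = direction_eps_outside(1e-6, 1e-12)
small_inside = direction_eps_inside(1e-6, 1e-12)

# Large second-moment estimate (assumed value): eps is negligible either way.
large_outside = direction_eps_outside(1.0, 1.0)
large_inside = direction_eps_inside(1.0, 1.0)

print(small_outside, small_inside)
print(large_outside, large_inside)
```

When `n` is much smaller than `eps`, `sqrt(n + eps)` is roughly `sqrt(eps)`, which is far larger than `sqrt(n) + eps`, so the variant with epsilon inside the square root takes much smaller steps. When `n` is large, the two variants agree closely.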

hr0nix avatar Sep 18 '22 15:09 hr0nix

After this change is merged and released, I'll add a note to the optax-adan README.md saying that the package is no longer needed, since Adan is now implemented in optax.

hr0nix avatar Sep 18 '22 15:09 hr0nix