Question about gradient accumulation
Hello, I want to fine-tune pi0_base on my machine, but I only have 48 GB of GPU memory. Is there any setting that lets me use gradient accumulation?
I have found the FSDP setting; I will try it soon.
For gradient accumulation, you can wrap the optimizer with optax.MultiSteps: https://optax.readthedocs.io/en/latest/api/optimizer_wrappers.html#optax.MultiSteps.
We didn't include it in the release, but I have tried it myself. Let me know if you run into issues.
Another reference: https://optax.readthedocs.io/en/latest/_collections/examples/gradient_accumulation.html
Thanks for your help! I will try the multi-GPU FSDP setting first.
We've decided to implement grad accumulation. Please stay tuned.
Thanks for your reply! I'm really looking forward to the update!