GradCache

functional approach with distributed training

Open kevinlin311tw opened this issue 4 years ago • 3 comments

Thank you for the great work!

Could you please provide an example of using the functional approach with distributed multi-GPU training?

kevinlin311tw, Jan 18 '22 23:01

Hi @kevinlin311tw, sure, I can add an example in a day or two.

As a side note, the functional approach itself is agnostic to parallelism: you only need to wrap your encoder model and do the cross-process communication in the loss function. This comment may be helpful if you want to give it a try yourself.
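For illustration, here is a minimal sketch of what "cross-process communication in the loss function" could look like in plain PyTorch. It is not taken from the GradCache codebase: the names `gather_with_local_grad` and `contrastive_loss` are hypothetical, and the sketch assumes an in-batch-negatives contrastive loss where every rank holds an equally sized shard of representations. Because `torch.distributed.all_gather` does not propagate gradients, the local shard is put back into the gathered list so that autograd still reaches the local encoder.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F


def gather_with_local_grad(t: torch.Tensor) -> torch.Tensor:
    """All-gather `t` across ranks; only the local shard keeps its gradient.

    Hypothetical helper, assuming every rank passes a tensor of the same shape.
    """
    if not (dist.is_available() and dist.is_initialized()):
        return t  # single-process fallback: nothing to gather
    gathered = [torch.empty_like(t) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, t.detach().contiguous())
    # Re-insert the original tensor so the local shard stays in the autograd graph.
    gathered[dist.get_rank()] = t
    return torch.cat(gathered, dim=0)


def contrastive_loss(q: torch.Tensor, p: torch.Tensor) -> torch.Tensor:
    """In-batch-negatives loss over the representations gathered from all ranks."""
    q = gather_with_local_grad(q)
    p = gather_with_local_grad(p)
    scores = q @ p.t()  # (global_batch, global_batch) similarity matrix
    target = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, target)
```

Under this assumption, the encoder would be wrapped in something like `torch.nn.parallel.DistributedDataParallel`, and the loss function above would be the only place where ranks exchange data. Without an initialized process group the helper simply returns its input, so the same code can be run on a single GPU or CPU.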

luyug, Jan 20 '22 02:01

I've added an example in the readme, along with a new all-gather decorator that may be helpful.

Feel free to ping me if you have any questions or find any problems with the code.

luyug, Jan 21 '22 04:01