GradCache
functional approach with distributed training
Thank you for the great work!
Could you please provide an example of using the functional approach with distributed multi-GPU training?
Hi @kevinlin311tw , sure, I can add an example in a day or two.
As a side note, the functional approach itself is agnostic to parallelism: you only need to wrap your encoder model and do the cross-process communication in the loss function. Maybe this comment will be helpful if you want to give it a try yourself.
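To illustrate the "communication in the loss function" part, here is a minimal sketch of a contrastive loss that all-gathers representations across processes. This is not GradCache code, just the usual pattern: `torch.distributed.all_gather` returns tensors detached from the autograd graph, so the local slice is swapped back in to keep its gradient path; the function and tensor names (`gather_with_grad`, `contrastive_loss`, `q_reps`, `p_reps`) are illustrative.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F

def gather_with_grad(t: torch.Tensor) -> torch.Tensor:
    # all_gather returns tensors detached from the autograd graph,
    # so re-insert the local tensor to keep its gradient path alive
    gathered = [torch.zeros_like(t) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, t.contiguous())
    gathered[dist.get_rank()] = t
    return torch.cat(gathered, dim=0)

def contrastive_loss(q_reps: torch.Tensor, p_reps: torch.Tensor) -> torch.Tensor:
    # gather query/passage representations from every process so the
    # in-batch negatives come from all GPUs, then score them locally
    q = gather_with_grad(q_reps)
    p = gather_with_grad(p_reps)
    scores = q @ p.T
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(scores, labels)
```

The encoder itself stays unchanged; only this loss function needs to know about the other processes, which is why the functional approach composes with any parallelism setup.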
I've added an example in the readme, along with a new all-gather decorator that may be helpful.
Feel free to ping me if you have any questions or find any problems with the code.