tf.contrib.distribute.MirroredStrategy
Hi there,
I noticed that the code below was commented out, and I would love to learn more about the thinking behind this approach, as I would like to utilize more than one GPU.
train_distribute = tf.contrib.distribute.MirroredStrategy(num_gpus=FLAGS.num_gpu)
Any light on this would be greatly appreciated.
Thanks, Roy
Hi Roy,
We tried to use multiple GPUs with MirroredStrategy, but the overhead is very high: with 4 GPUs, the speedup over a single GPU is far lower than 4x. We didn't find a good way to use multiple GPUs, so we just skipped it. There may have been progress on multi-GPU training for BERT since we released our code.
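For anyone who wants to experiment, the strategy would be passed to the estimator roughly like this. This is a minimal sketch rather than the exact repo code: `model_fn`, `train_input_fn`, and the flag names are stand-ins for the repo's actual ones.

```python
import tensorflow as tf

# Sketch only: wire MirroredStrategy into an estimator via RunConfig.
# FLAGS.num_gpu is assumed to hold the number of GPUs to mirror across.
strategy = tf.contrib.distribute.MirroredStrategy(num_gpus=FLAGS.num_gpu)

run_config = tf.estimator.RunConfig(
    model_dir=FLAGS.model_dir,
    train_distribute=strategy,                       # replicate training across GPUs
    save_checkpoints_steps=FLAGS.save_checkpoints_steps)

estimator = tf.estimator.Estimator(
    model_fn=model_fn,                               # the repo's model_fn (placeholder here)
    config=run_config,
    params={"batch_size": FLAGS.train_batch_size})

estimator.train(input_fn=train_input_fn,             # the repo's input_fn (placeholder here)
                max_steps=FLAGS.num_train_steps)
```

Note that with MirroredStrategy the global batch is split across replicas, so the per-GPU batch size (and hence the cross-device synchronization cost per step) is part of why the observed speedup was well below 4x for us.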
Qizhe
Thanks Qizhe