Pinyu Su

Results: 1 issue by Pinyu Su

1. Background: I am training a Robert model on 4 A100 GPUs. Training works on a single GPU, but multi-GPU training hangs every time, and **GPU utilization stays at 100% without changing, with each GPU using about 1.3 GB of memory** ...