
Problem about multi-GPU training

Open • Junjie31 opened this issue 3 years ago • 1 comment

Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Function 'L1LossBackward0' returned nan values in its 0th output.

Junjie31, Nov 07 '22 14:11

We didn't encounter this issue. You could try applying an eps to the L1 loss to prevent NaN values.

yingchen001, Nov 14 '22 05:11
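For anyone landing here with the same error, one possible reading of the "eps on the L1 loss" suggestion is a Charbonnier-style smoothed L1, sketched below in PyTorch. This is only an illustration, not the repository's actual loss: the names `l1_loss_with_eps` and `all_finite` and the value `eps=1e-6` are assumptions made up for this example. Note that if the prediction or target already contains NaN or Inf before the loss is computed, an eps will not help, so the sketch also includes a small finiteness check that may help narrow down where the NaN first appears.

```python
import torch


def l1_loss_with_eps(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical smoothed L1 loss: mean(sqrt((pred - target)^2 + eps^2)).

    Unlike a plain absolute value, this is differentiable at pred == target.
    """
    diff = pred - target
    return torch.sqrt(diff * diff + eps * eps).mean()


def all_finite(t: torch.Tensor) -> bool:
    """Return True if the tensor contains no NaN or Inf entries."""
    return bool(torch.isfinite(t).all())


# Worst case for the gradient of |x|: prediction exactly equals the target.
pred = torch.zeros(4, requires_grad=True)
target = torch.zeros(4)

loss = l1_loss_with_eps(pred, target)
loss.backward()

# Both the loss and its gradient should be finite here.
print(all_finite(loss), all_finite(pred.grad))
```

In a training loop, calling `all_finite` on the network output and the target just before the loss can show whether the NaN originates upstream of the loss, in which case the cause is more likely elsewhere (for example the learning rate, mixed precision, or the input data) than in the L1 term itself.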