matex

MPI AllReduce error

Open abidmalikwaterloo opened this issue 7 years ago • 0 comments

I got the following error:

2018-07-16 15:27:27.536541: W tensorflow/core/framework/op_kernel.cc:1192] Unknown: Exception: Message truncated, error stack:
MPI_Allreduce(855)..................: MPI_Allreduce(sbuf=MPI_IN_PLACE, rbuf=0x2049aaa00, count=256, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD) failed
MPIR_Allreduce_impl(712)............:
MPIR_Allreduce_intra(357)...........:
MPIC_Sendrecv(186)..................:
MPIDI_CH3U_Request_unpack_uebuf(599): Message truncated; 1536 bytes received but buffer size is 1024
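The byte counts in the last line of the stack can be turned back into element counts, which may help narrow things down. The sketch below is only arithmetic on the numbers in the log. It assumes MPI_FLOAT is 4 bytes, and the helper name `elements` is made up for illustration.

```python
FLOAT_BYTES = 4  # assumed size of MPI_FLOAT (true on common platforms)


def elements(nbytes, elem_size=FLOAT_BYTES):
    """Convert a byte count from the error message into an element count."""
    if nbytes % elem_size != 0:
        raise ValueError("byte count is not a whole number of elements")
    return nbytes // elem_size


# "buffer size is 1024": what the failing rank expected to receive.
local_count = elements(1024)
# "1536 bytes received": what the peer rank actually sent.
remote_count = elements(1536)

print("local count: ", local_count)
print("remote count:", remote_count)
```

The local value matches `count=256` in the MPI_Allreduce call shown in the stack, while the peer sent the equivalent of 384 floats. That is consistent with ranks entering MPI_Allreduce with different counts, for example because they reduce tensors of different shapes or in a different order, though the log alone does not confirm which.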

Any comments?

abidmalikwaterloo, Jul 16 '18 20:07