OOM when allocating tensor with shape[512,256,14,14]
@souryuu When I used the latest version of your code for training, I got the error below. Did you face this kind of issue?
```
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[512,256,14,14]
	 [[Node: pyramid_1/Conv2d_transpose/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pyramid_1/Conv2d_transpose/stack, pyramid/Conv2d_transpose/weights/read, pyramid_1/Conv_3/Relu)]]
	 [[Node: pyramid_2/Reshape_72/_2085 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_10899_pyramid_2/Reshape_72", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op u'pyramid_1/Conv2d_transpose/conv2d_transpose', defined at:
  File "train/train.py", line 361, in

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[512,256,14,14]
	 [[Node: pyramid_1/Conv2d_transpose/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pyramid_1/Conv2d_transpose/stack, pyramid/Conv2d_transpose/weights/read, pyramid_1/Conv_3/Relu)]]
	 [[Node: pyramid_2/Reshape_72/_2085 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_10899_pyramid_2/Reshape_72", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]]
```
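For context, the single tensor named in the error is already large: a `[512, 256, 14, 14]` float32 tensor is roughly 98 MiB on its own, and backprop through the transposed convolution needs several buffers of that size alive at once. A quick back-of-the-envelope check (pure Python, just arithmetic on the shape from the message):

```python
# Estimate memory for the tensor in the OOM message:
# shape [512, 256, 14, 14], dtype DT_FLOAT (float32 = 4 bytes per element).
from functools import reduce

shape = [512, 256, 14, 14]
num_elements = reduce(lambda a, b: a * b, shape)  # 25,690,112 elements
bytes_needed = num_elements * 4                   # 4 bytes for float32
print(f"{bytes_needed / 2**20:.1f} MiB")          # -> 98.0 MiB
```

Since the leading dimension scales with how many examples (or proposals) flow through the pyramid at once, shrinking that count shrinks this allocation proportionally.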
Regards, Sharath
Use a smaller batch size, like 8 or 12, and also a smaller vocab. Check https://github.com/tensorflow/nmt/issues/348 — that's what I am doing right now. I'll get back to this post if it doesn't work.
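Alongside a smaller batch size, it can also help to let TensorFlow allocate GPU memory on demand instead of grabbing it all up front. A minimal sketch for the TF 1.x API this traceback comes from (where exactly the session is created depends on the training script, so treat this as a config fragment to adapt):

```python
import tensorflow as tf

# TF 1.x session config: allocate GPU memory incrementally as needed
# rather than reserving nearly all of it at startup.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Optionally cap the fraction of GPU memory TensorFlow may use:
# config.gpu_options.per_process_gpu_memory_fraction = 0.9

sess = tf.Session(config=config)
```

This won't fix a model that genuinely needs more memory than the GPU has, but it avoids pre-allocation conflicts and makes the real peak usage visible in `nvidia-smi`.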