Invalid argument: logits and labels must have the same first dimension
When I try the NMT tutorial with default params, I get the following error, and I don't know how to fix it.
```
INFO:tensorflow:Creating vocabulary lookup table of size 37007
INFO:tensorflow:Creating vocabulary lookup table of size 37007
INFO:tensorflow:Creating BidirectionalRNNEncoder in mode=eval
INFO:tensorflow: BidirectionalRNNEncoder: init_scale: 0.04 rnn_cell: cell_class: GRUCell cell_params: {num_units: 128} dropout_input_keep_prob: 0.8 dropout_output_keep_prob: 1.0 num_layers: 1 residual_combiner: add residual_connections: false residual_dense: false
INFO:tensorflow:Creating AttentionLayerDot in mode=eval
INFO:tensorflow: AttentionLayerDot: {num_units: 128}
INFO:tensorflow:Creating AttentionDecoder in mode=eval
INFO:tensorflow: AttentionDecoder: init_scale: 0.04 max_decode_length: 100 rnn_cell: cell_class: GRUCell cell_params: {num_units: 128} dropout_input_keep_prob: 0.8 dropout_output_keep_prob: 1.0 num_layers: 1 residual_combiner: add residual_connections: false residual_dense: false
INFO:tensorflow:Creating ZeroBridge in mode=eval
INFO:tensorflow: ZeroBridge: {}
INFO:tensorflow:Starting evaluation at 2017-04-19-09:18:33
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080, pci bus id: 0000:03:00.0)
W tensorflow/core/framework/op_kernel.cc:993] Out of range: Reached limit of 1
     [[Node: dev_input_fn/parallel_read_1/filenames/limit_epochs/CountUpTo = CountUpToT=DT_INT64, _class=["loc:@dev_input_fn/parallel_read_1/filenames/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"]]
W tensorflow/core/framework/op_kernel.cc:993] Invalid argument: logits and labels must have the same first dimension, got logits shape [1344,37007] and labels shape [1568]
     [[Node: model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/Reshape, model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
W tensorflow/core/framework/op_kernel.cc:993] Invalid argument: logits and labels must have the same first dimension, got logits shape [1344,37007] and labels shape [1568]
     [[Node: model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/Reshape, model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
...
InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [1344,37007] and labels shape [1568]
     [[Node: model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/Reshape, model/att_seq2seq/cross_entropy_sequence_loss/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
     [[Node: mean/broadcast_weights/assert_broadcastable/is_valid_shape/has_valid_nonscalar_shape/has_invalid_dims/ExpandDims_1/_415 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_433_mean/broadcast_weights/assert_broadcastable/is_valid_shape/has_valid_nonscalar_shape/has_invalid_dims/ExpandDims_1", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]]
```
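For context, the failing op is plain `tf.nn.sparse_softmax_cross_entropy_with_logits`: it requires flattened logits of shape `[N, num_classes]` and labels of shape `[N]` with the same `N`, where `N` here is batch size × decoder time steps. The error therefore suggests the decoder produced fewer time steps than the padded labels it is scored against. A minimal TF 1.x sketch of that shape contract (the shapes below are illustrative, not taken from the tutorial):

```python
import numpy as np
import tensorflow as tf

# Illustrative sizes only; the real batch/time sizes come from the eval batches.
vocab_size = 37007

logits = tf.placeholder(tf.float32, [None, vocab_size])  # flattened: [batch * time, vocab]
labels = tf.placeholder(tf.int64, [None])                # flattened: [batch * time]

# The op requires logits.shape[0] == labels.shape[0]; a decoder that unrolls
# for fewer steps than the padded targets breaks this at run time.
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    feed = {logits: np.zeros([6, vocab_size], np.float32),
            labels: np.zeros([6], np.int64)}
    print(sess.run(losses, feed))  # works: first dimensions match (6 == 6)
    # Feeding e.g. 1344 logit rows against 1568 labels raises the
    # InvalidArgumentError shown in the log above.
```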
Hi, did you solve this problem? I think I have the same problem, but I have no idea how to fix it.
Bumping this issue. I'm having the same problem.
Have you noticed that the decoder outputs have one fewer time step? Strange.
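If that is what is happening, one way to confirm it is to compare the time dimensions of the decoder logits and the targets just before the loss. A small sketch under assumed shapes (the function and layouts here are not taken from the seq2seq code):

```python
import tensorflow as tf

def check_time_steps(logits, targets):
    """Compare decoder output steps with target steps before the loss.

    Assumed shapes (time-major tensors would need transposing first):
      logits:  [batch, logit_steps, vocab]
      targets: [batch, target_steps]
    """
    logit_steps = tf.shape(logits)[1]
    target_steps = tf.shape(targets)[1]
    # Fail fast with a readable message instead of the opaque
    # SparseSoftmaxCrossEntropyWithLogits shape error.
    assert_op = tf.Assert(tf.equal(logit_steps, target_steps),
                          ["decoder steps vs target steps:", logit_steps, target_steps])
    with tf.control_dependencies([assert_op]):
        return tf.identity(logits)
```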
I have the same exception, and my error happens on the 9th batch. It is weird that the earlier batches have no error!
@WilliamJoe992 Have you solved this issue? I have exactly the same exception as yours: it happens after several normal batches, and it only happens when I use multiple GPUs for parallel training.
Incredibly late to the party, but I was having a similar issue. It turns out I was setting my actual sequence lengths before adding my START/END tokens, which might be what is/was causing your issue. A rough sketch of the pitfall is below.
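For anyone hitting the same thing, here is a minimal sketch of that pitfall. The helper and the `SOS_ID`/`EOS_ID` values are hypothetical placeholders, not the tutorial's vocabulary ids; the point is only that the length must be measured after the special tokens are concatenated, otherwise the decoder unrolls for fewer steps than the labels it is scored against.

```python
import tensorflow as tf

# Placeholder ids for illustration; the real ids come from the vocabulary files.
SOS_ID = 1
EOS_ID = 2

def target_with_length(token_ids):
    """token_ids: 1-D int32 tensor of word ids for a single target sentence."""
    # Pitfall: measuring the length here, before SOS/EOS are added, makes the
    # decoder unroll for two fewer steps than the labels it is compared to.
    # wrong_length = tf.size(token_ids)

    with_tokens = tf.concat([[SOS_ID], token_ids, [EOS_ID]], axis=0)
    length = tf.size(with_tokens)  # measure *after* adding the special tokens
    return with_tokens, length
```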