CNTK fails to build a model with the `ConvLSTM2D` layer in some cases

Open shiningrain opened this issue 5 years ago • 0 comments

System information

Have I written custom code (as opposed to using example directory):
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 & Linux Ubuntu 18.04
CNTK version: 2.7
Python version: 3.6.9
CUDA/cuDNN version: -
GPU model and memory: -

Describe the current behavior

CNTK fails to build a model that contains a ConvLSTM2D layer and raises the error shown below (from cntk/ops/__init__.py, line 421). The ConvLSTM2D parameters that trigger it are listed in the code snippet under "Code to reproduce the issue".

<class 'ValueError'>, ValueError('Convolution operation requires that kernel dim 5 <= input dim 4

Analysis for CNTK error

We found that this CNTK error depends jointly on the input dimension and kernel_size. The input dimension is calculated in keras\backend\cntk_backend.py, line 1731, as shown in the following picture.

The calculation in line 1731 yields different shapes for different parameters, as shown in the following table.

kernel_size=4; input_shape=[16,16,8]

| strides | output shape after `convolution` |
|---------|----------------------------------|
| 1       | [8,13,5]                         |
| 2       | [8,7,3]                          |
| 3       | [8,5,2]                          |
| 4       | [8,4,2]                          |
| 5       | [8,3,1]                          |
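
The shapes in the table are consistent with the usual output-size formula for an unpadded ('valid') convolution, floor((n - k) / s) + 1 per spatial axis. The sketch below reproduces the table under that assumption; it is not taken from cntk_backend.py, and the helper names `conv_output_size` and `conv_output_shape` are hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (input_size - kernel_size) // stride + 1


def conv_output_shape(filters, spatial, kernel_size, stride):
    """Shape [filters, rows, cols] after a 'valid' convolution (channels_first).

    Assumption: the same kernel_size and stride apply to both spatial axes.
    """
    rows, cols = spatial
    return [
        filters,
        conv_output_size(rows, kernel_size, stride),
        conv_output_size(cols, kernel_size, stride),
    ]


# Reproduce the table: filters=8, kernel_size=4, spatial size 16x8.
for stride in range(1, 6):
    print(stride, conv_output_shape(8, (16, 8), 4, stride))
```

With kernel_size=5 and strides=1, as in the reproduction script, the same formula gives [8,12,4], which matches the "kernel dim 5 <= input dim 4" message reported above.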

The output of this convolution is then used as the input of the convolution in cntk\ops\__init__.py, line 421, as shown in the following picture.

In this convolution, the 3rd dimension of convolution_map (e.g., 4 here) is compared with the 2nd dimension of toSequence_Input4 (e.g., 13 here), and the 4th dimension of convolution_map (e.g., 4 here) is compared with the 3rd dimension of toSequence_Input4 (e.g., 5 here). If a value in convolution_map is greater than the corresponding value in toSequence_Input4, CNTK raises an error like ValueError('Convolution operation requires that kernel dim xx <= input dim xx.

As an example, if we set strides=1 and kernel_size=4 or (4,4), the shape of toSequence_Input4 will be [8,13,5], as shown in the above picture. 4 is less than both 13 and 5, so the error is not triggered in this case. In contrast, if we change to strides=3, the shape of toSequence_Input4 will be [8,5,2]. 4 is greater than 2, and this leads to the error ValueError('Convolution operation requires that kernel dim 4 <= input dim 2.
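
The comparison described above can be sketched as a small standalone check. This is only an illustration of the rule as we understand it, not CNTK's actual code: the function name `check_kernel_fits` is hypothetical, and the returned message simply mimics the wording of the error quoted in this report.

```python
def check_kernel_fits(kernel_size, conv_shape):
    """Return an error message if a kernel dim exceeds the matching input dim.

    conv_shape is [filters, rows, cols], i.e. the shape after the input
    convolution. Returns None when the kernel fits on both spatial axes.
    """
    if isinstance(kernel_size, int):
        # An int kernel_size is treated as a square kernel.
        kernel_size = (kernel_size, kernel_size)
    for kernel_dim, input_dim in zip(kernel_size, conv_shape[1:]):
        if kernel_dim > input_dim:
            return ("Convolution operation requires that kernel dim "
                    f"{kernel_dim} <= input dim {input_dim}")
    return None


print(check_kernel_fits(4, [8, 13, 5]))  # strides=1: the kernel fits
print(check_kernel_fits(4, [8, 5, 2]))   # strides=3: kernel 4 > input 2
```

Running such a check before building the model would let a user see which kernel_size and strides combinations are expected to fail.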

Key insights

The implementation of ConvLSTM2D on CNTK seems to be flawed. As summarized above, when kernel_size exceeds a spatial dimension of the input after the convolution, CNTK fails to build the model.

Additionally, there are no relevant hints or warnings about this problem in the documentation, which may confuse CNTK users.

Application

When I run a MobileNetV2 model (with ConvLSTM2D) on CNTK, it raises the error shown in the following picture. However, the same model runs prediction normally on TensorFlow and Theano and produces similar outputs.

Code to reproduce the issue

import numpy as np
import keras.layers as L
from keras.engine import Model, Input

# Using CNTK as the Keras backend.
# The input dtype defaults to float32.

kwargs = {
    'filters': 8,
    'kernel_size': 5,
    # 'kernel_size': (1, 5),
    'strides': 1,
    'padding': 'valid',
    'data_format': 'channels_first',
    'dilation_rate': 1,
    'use_bias': True,
    'unit_forget_bias': False,
    'return_sequences': True,
    'go_backwards': False,
    'stateful': False,
    'dropout': 0.7232577469807254,
    'recurrent_dropout': 0.7507926892266159
}

# Shape is (batch, time, channels, rows, cols) because data_format is 'channels_first'.
data = 10 * np.random.random((1, 16, 16, 16, 8))
layer = L.convolutional_recurrent.ConvLSTM2D(**kwargs)
x = Input(batch_shape=data.shape)
y = layer(x)
bk_model = Model(x, y)
print('finish')

Mar 05 '20 06:03 shiningrain