
PyTorch to CoreML via convert() in v4.0b3 has several bugs with Flexible Input Shapes, seqLen and nFeatures swapped?

Open leovinus2001 opened this issue 5 years ago • 0 comments

Objective:

Advise the coremltools converter that the sequence length of the input is variable. This is useful in the context of LSTMs, Transformers, and other setups.

Reproducible:

Yes, all issues are reproduced in the test case, annotated and explained below, and in the logfile.

Summary:

  • BUG 1, general: the mlmodel spec shows that we have a duplicated shape (1,4,5) in the .mlmodel with ct.EnumeratedShapes(shapes= [ (1,4,5), (1,3,5) ] ). See also issue #756.
  • BUG 2, TransformerEncoder: ct.EnumeratedShapes(shapes= [ (1,4,5), (1,3,5) ] ) is parsed or swapped in conversion; should work.
  • BUG 3, TransformerEncoder: ct.EnumeratedShapes(shapes= [ (1,3,5), (1,3,6) ] ) is parsed or swapped in conversion; should fail.
  • BUG 4, TransformerEncoder: ( 1, ct.RangeDim(2,10), 5 ) is parsed or swapped in conversion; should work.
  • BUG 5, TransformerEncoder: ( 1, 3, ct.RangeDim(2,10) ) is parsed or swapped in conversion; should fail.

Details below.

Ground truth is here https://coremltools.readme.io/docs/flexible-inputs

Testcase:

Yes, testFlexibleShape.aug26.txt

Run as: python3 testFlexibleShape.aug26.py

This test case runs two different PyTorch models (LSTM, TransformerEncoder), each of which accepts variable-length input, and tests each model against five different input shapes passed to ct.convert(): some fixed shapes, some enumerated, some RangeDims.

We test five input shapes

  • Fixed shape (1,3,5)
  • inputShape = ct.EnumeratedShapes(shapes= [ (1,4,5), (1,3,5) ] )
  • inputShape = ct.EnumeratedShapes(shapes= [ (1,3,5), (1,3,6) ] )
  • inputShape = ( 1, ct.RangeDim(2,10), 5 )
  • inputShape = ( 1, 3, ct.RangeDim(2,10) )
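Assuming the layout is (batch, seqLen, nFeatures) with the feature dimension fixed by the model (LSTM input_size = TransformerEncoder d_model = 5), the expected pass/fail pattern for these five shapes can be sketched with a small, hypothetical validity check. This is plain Python, not the coremltools API; `shape_ok` and `enumerated_ok` are illustrative names only:

```python
def shape_ok(shape, n_features=5, feature_axis=2):
    """A candidate (batch, seqLen, nFeatures) shape is acceptable when the
    feature dimension matches the model; only the sequence axis may vary."""
    return shape[feature_axis] == n_features

def enumerated_ok(shapes, n_features=5):
    """An enumerated set of shapes is acceptable only if every member
    keeps nFeatures fixed at the model's expected size."""
    return all(shape_ok(s, n_features) for s in shapes)

# The five cases from the test; "R" stands in for a RangeDim placeholder:
print(shape_ok((1, 3, 5)))                    # fixed shape        -> True
print(enumerated_ok([(1, 4, 5), (1, 3, 5)]))  # seqLen varies      -> True
print(enumerated_ok([(1, 3, 5), (1, 3, 6)]))  # nFeatures varies   -> False
print(shape_ok((1, "R", 5)))                  # RangeDim on seqLen -> True
print(shape_ok((1, 3, "R")))                  # RangeDim on feats  -> False
```

This reproduces the expected PASS/PASS/FAIL/PASS/FAIL pattern that the LSTM tests (0-4) actually show, and that the TransformerEncoder tests (5-9) should show but do not.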

The first five tests (0,1,2,3,4) are run with the LSTM and the latter five (5,6,7,8,9) with the TransformerEncoder.

Five bugs in total, I think.

Setup:

macOS Catalina
Python version: 3.7.6 (v3.7.6:43364a7ae0, Dec 18 2019, 14:18:50) [Clang 6.0 (clang-600.0.57)]
Torch version: 1.6.0
CoreML tools version: 4.0b3

Log:

Log file is attached here. log.torch.1.6.0.txt

Interpretation, bug list, and issues for setups 0 to 9:

0 -------------------------------------------------- TEST= 0 inputShape type = 0 LSTM, Expected to PASS, Convert PASS, Predict PASS

No BUG. This is basic behavior with a fixed input shape, as a sanity test with sequence length 3 and 5 input features.

1 -------------------------------------------------- TEST= 0 inputShape type = 1 LSTM Expected to PASS, Convert PASS, Predict PASS

One BUG,

Behavior correct, but BUG 1: the mlmodel spec shows that we have a duplicated shape (1,4,5) in the .mlmodel.

2 -------------------------------------------------- TEST= 0 inputShape type = 2 LSTM Expected to FAIL, Convert FAIL, no predict, looks ok.

No BUG.

We get the error ValueError: Incorrect weight matrix: hidden dim size mismatch. Provided (12, 28). Expecting <b, 4*DIRECTION*H>, but that makes sense: the inputShape is ct.EnumeratedShapes(shapes= [ (1,3,5), (1,3,6) ] ) and the LSTM here can only process vectors [1 x 5], not [1 x 6].

3 -------------------------------------------------- TEST= 0 inputShape type = 3 LSTM Expected to PASS, Convert PASS, Predict PASS, no BUG

No BUG with this RangeDim on sequenceLength.

4 -------------------------------------------------- TEST= 0 inputShape type = 4 LSTM Expected to FAIL, Convert FAIL, Predict FAIL, no BUG

No BUG.

We get the error ValueError: Incorrect weight matrix: hidden dim size mismatch. Provided (12, 28). Expecting <b, 4*DIRECTION*H>, but that makes sense as the inputShape is ( 1, 3, ct.RangeDim(2,10) ) and the LSTM here can process input vectors [1 x 5] but not [1 x N].

5 -------------------------------------------------- TEST= 1 inputShape type = 0 TransformerEncoder Expected to PASS, Convert PASS, Predict PASS, no BUG

No BUG. This is basic TransformerEncoder behavior with a fixed input shape, as a sanity test with sequence length 3 and 5 input features.

6 -------------------------------------------------- TEST= 1 inputShape type = 1 TransformerEncoder Expected to PASS, Convert PASS which is ok, Predict FAIL which is a BUG

As the inputShape is now inputShape = ct.EnumeratedShapes(shapes= [ (1,4,5), (1,3,5) ] )

BUG 2:

First we see:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/coremltools/models/model.py:119: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: Error compiling model: "compiler error: Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast [5, 4, 1, 1, 1] and [5, 3, 1, 1, 1]".

but the mlmodel spec in the logfile actually looks ok.

The prediction fails with

RuntimeError: Error compiling model: "compiler error: Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast [5, 4, 1, 1, 1] and [5, 3, 1, 1, 1]".

In a nutshell, the TransformerEncoder with flexible input shape ct.EnumeratedShapes(shapes= [ (1,4,5), (1,3,5) ] ) SHOULD WORK, because we only change the sequence length (4 vs 3), not the number of input features (5, i.e. d_model=5).
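The broadcast failure is telling: the shapes in the error, [5, 4, 1, 1, 1] vs [5, 3, 1, 1, 1], differ exactly in the slot holding the enumerated sequence length (4 vs 3), and under numpy-style broadcasting such dims can never be reconciled. A minimal sketch of that rule (`broadcastable` is a hypothetical helper, not part of any library):

```python
def broadcastable(a, b):
    """Numpy-style broadcast check: aligned trailing dims are compatible
    when they are equal or one of them is 1."""
    return all(x == y or x == 1 or y == 1 for x, y in zip(a[::-1], b[::-1]))

# The two blob shapes from the Espresso compiler error:
print(broadcastable([5, 4, 1, 1, 1], [5, 3, 1, 1, 1]))  # False: 4 vs 3
print(broadcastable([5, 3, 1, 1, 1], [5, 3, 1, 1, 1]))  # True
```

So one of the two enumerated sequence lengths appears to have been baked into a constant-shaped blob, which then cannot broadcast when the other length is used.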

7 -------------------------------------------------- TEST= 1 inputShape type = 2 TransformerEncoder Expected to FAIL, Convert PASS which is a BUG, Predict FAIL ok

As the inputShape is now inputShape = ct.EnumeratedShapes(shapes= [ (1,3,5), (1,3,6) ] ), we tell convert() that we can have sequenceLength == 3 and numInputFeatures of 5 or 6.

That makes no sense, as the TransformerEncoder is hardwired to d_model=5 == number of input features.

BUG 3:

HOWEVER, the conversion passes but it should fail.

Speculation: it looks like coremltools flips/swaps some input dimensions, like seqLen and nFeatures, for the TransformerEncoder but not the LSTM.
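This speculation can be made concrete: if the converter validates enumerated shapes against the wrong layout, so that the axis it allows to vary is really the feature axis, the accept/reject outcomes invert exactly as observed in tests 6 and 7. A purely speculative sketch (`varying_axes` and `convert_accepts` are hypothetical illustrations, not coremltools internals):

```python
def varying_axes(shapes):
    """Axes whose size differs across the enumerated shapes."""
    return {i for i in range(len(shapes[0]))
            if len({s[i] for s in shapes}) > 1}

def convert_accepts(shapes, seq_axis):
    """Hypothetical rule: only the sequence axis may vary."""
    return varying_axes(shapes) <= {seq_axis}

seq_varies  = [(1, 4, 5), (1, 3, 5)]  # only seqLen (axis 1) differs
feat_varies = [(1, 3, 5), (1, 3, 6)]  # only nFeatures (axis 2) differs

# Correct layout (batch, seqLen, nFeatures): seq_axis = 1
assert convert_accepts(seq_varies, seq_axis=1)       # test 6: should pass
assert not convert_accepts(feat_varies, seq_axis=1)  # test 7: should fail

# Swapped layout (batch, nFeatures, seqLen): seq_axis = 2
# exactly inverts both outcomes, matching the observed behavior:
assert not convert_accepts(seq_varies, seq_axis=2)
assert convert_accepts(feat_varies, seq_axis=2)
```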

8 -------------------------------------------------- TEST= 1 inputShape type = 3 TransformerEncoder Expected to PASS, Convert PASS ok, Predict FAIL which is a bug

As the inputShape is now inputShape = ( 1, ct.RangeDim(2,10), 5 ), we tell the TransformerEncoder that we can get sequences in the range [2, 10] while keeping nInputFeatures at 5 == d_model.

While the model spec looks good, we fail in prediction

BUG 4:

Traceback (most recent call last):
  File "testFlexibleShape.aug26.py", line 101, in
    outputCoreML = mlmodel.predict( { inputName[0]: dummy_input.detach().numpy() }, useCPUOnly=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/coremltools/models/model.py", line 367, in predict
    raise self._framework_error
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/coremltools/models/model.py", line 113, in _get_proxy_and_spec
    return (_MLModelProxy(filename, use_cpu_only), specification, None)
RuntimeError: Error compiling model: "compiler error: Espresso exception: "Invalid argument": generic_reshape_kernel: Invalid bottom shape (5 2 1 1 1) for reshape to (5 3 1 1 1)".

ERROR: PREDICT() on mlmodel failed! for test 1 inputShape type = 3, which is: Flexible, rangeDim shape, change the sequenceLength to a range [2,10]

9 -------------------------------------------------- TEST= 1 inputShape type = 4 TransformerEncoder Expected to FAIL, Convert PASS which is a BUG, Predict FAIL ok

As the inputShape is now inputShape = ( 1, 3, ct.RangeDim(2,10) ), we tell the TransformerEncoder that the number of input features can be in the range [2, 10] while the sequence length stays at 3.

BUG 5:

That should fail, as such a dynamic model with a variable number of input features is not possible. convert() only warns, but it should actually fail!

Speculation: it looks like coremltools flips/swaps some input dimensions, like seqLen and nFeatures, for the TransformerEncoder but not the LSTM.

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/coremltools/models/model.py:119: RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was: Error compiling model: "compiler error: Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast [2, 3, 1, 1, 1] and [5, 3, 1, 1, 1]".

Traceback (most recent call last):
  File "testFlexibleShape.aug26.py", line 101, in
    outputCoreML = mlmodel.predict( { inputName[0]: dummy_input.detach().numpy() }, useCPUOnly=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/coremltools/models/model.py", line 367, in predict
    raise self._framework_error
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/coremltools/models/model.py", line 113, in _get_proxy_and_spec
    return (_MLModelProxy(filename, use_cpu_only), specification, None)
RuntimeError: Error compiling model: "compiler error: Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast [2, 3, 1, 1, 1] and [5, 3, 1, 1, 1]".

ERROR: PREDICT() on mlmodel failed! for test 1 inputShape type = 4, which is: Flexible, rangeDim shape, change the numFeatures to a range [2,10] (which won't work for TransformerEncoder as [-1] == numFeatures)

leovinus2001 · Aug 26 '20 19:08