
Failure to modify graph of transformer models

kylesayrs opened this issue 3 years ago · 8 comments

Hi, any tips on debugging or working around this issue would be appreciated.

Issue: ArmNN fails to modify the TfLite graph of transformer models. The problem occurs with many transformer models, but it is most easily reproduced with the GPT-2 model.

ArmNN version: ArmNN v29.0.0, release v22.05.01, prebuilt binaries for x86
ACL version: Not downloaded
TfLite version: tensorflow == 2.10.0rc0
System OS: Linux
Device: Graviton gen 1 (EC2 a1)

Download gpt2-64.tflite, or generate it with the following script (instructions here):

import tensorflow as tf
from transformers import TFGPT2LMHeadModel

model = TFGPT2LMHeadModel.from_pretrained('gpt2') # or 'distilgpt2'

# Fix the input signature to a batch of one sequence of 64 token ids
input_spec = tf.TensorSpec([1, 64], tf.int32)
model._set_inputs(input_spec, training=False)

print(model.inputs)
print(model.outputs)

converter = tf.lite.TFLiteConverter.from_keras_model(model)

# For FP16 quantization:
# converter.optimizations = [tf.lite.Optimize.DEFAULT]
# converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()

with open("gpt2-64.tflite", "wb") as f:
    f.write(tflite_model)

There is also a second method that produces the same result:

import tensorflow as tf
from transformers import TFGPT2LMHeadModel

base_model = TFGPT2LMHeadModel.from_pretrained('gpt2') # or 'distilgpt2'

# Wrap the model in a functional Keras model with an explicit int32 input
input_ids = tf.keras.layers.Input((32,), batch_size=None, dtype=tf.int32, name="input_ids")
inputs = [input_ids]

outputs = base_model(inputs)
model = tf.keras.models.Model(inputs=inputs, outputs=outputs)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("gpt2-method2.tflite", "wb") as f:
    f.write(tflite_model)

Attempt to set up the interpreter:

import os
import tflite_runtime.interpreter

model_path = "gpt2-64.tflite"
# expanduser() so the "~" resolves to an absolute path before the library is loaded
delegate_lib_path = os.path.expanduser("~/ArmNN-linux-aarch64/libarmnnDelegate.so")

delegate = tflite_runtime.interpreter.load_delegate(
    library=delegate_lib_path,
    options={"backends": "CpuAcc,CpuRef", "logging-severity": "info"},
)
interpreter = tflite_runtime.interpreter.Interpreter(
    model_path=model_path,
    experimental_delegates=[delegate],
)

Console output

Info: ArmNN v29.0.0
Warning: WARNING: The given backend path "/build/armnn-YyCEBh/armnn-22.05.01" does not exist
Info: Initialization time: 44.16 ms.
INFO: TfLiteArmnnDelegate: Created TfLite ArmNN delegate.
Traceback (most recent call last):
  File "tflite_engine.py", line 82, in __init__
    self._interpreter = tflite_runtime.interpreter.Interpreter(
  File "/home/ubuntu/tflite/tflite/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 495, in __init__
    self._interpreter.ModifyGraphWithDelegate(
RuntimeError
Info: Shutdown time: 0.02 ms.

This error is pretty unhelpful. This model (and other transformer models) works fine with XNNPACK. I've tried with both the CpuAcc and CpuRef backends. Any theories or insights are appreciated!
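For reference, the working XNNPACK baseline is just the stock interpreter with no external delegate (a sketch; recent prebuilt TfLite packages apply XNNPACK to float models by default, as far as I know):

import numpy as np
import tflite_runtime.interpreter

# Stock interpreter, no ArmNN delegate attached
interpreter = tflite_runtime.interpreter.Interpreter(model_path="gpt2-64.tflite")
interpreter.allocate_tensors()

# Dummy token ids matching the [1, 64] int32 input
input_index = interpreter.get_input_details()[0]["index"]
interpreter.set_tensor(input_index, np.zeros((1, 64), dtype=np.int32))
interpreter.invoke()
print("invoke OK")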

kylesayrs avatar Aug 22 '22 19:08 kylesayrs

I recently found that ArmNN does not support layer norm. If that is the issue, it would be nice if the error reflected it.
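One way to see which ops the delegate would have to handle (a sketch, assuming TensorFlow >= 2.9, where the model analyzer API was added):

import tensorflow as tf

# Prints the graph structure and every builtin/custom op in the file
tf.lite.experimental.Analyzer.analyze(model_path="gpt2-64.tflite")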

kylesayrs avatar Aug 22 '22 19:08 kylesayrs

@kylesayrs From the output, it seems that it cannot find libarmnn.so. Can you check that libarmnn.so exists at the path "/build/armnn-YyCEBh/armnn-22.05.01"?

For example, as in BuildGuideNative.md:

cd $BASEDIR/armnn/delegate/build
./DelegateUnitTests --test-suite=*CpuAcc*

or

LD_LIBRARY_PATH=../armnn/build ./benchmark_model \
    --graph=mobilenet_v2_1.0_224_quantized_1_default_1.tflite \
    --external_delegate_path="../armnn/build/delegate/libarmnnDelegate.so" \
    --external_delegate_options="backends:CpuAcc;logging-severity:info"

xiaotongnii avatar Aug 23 '22 05:08 xiaotongnii

Hi @kylesayrs, it looks like a TfLite runtime error rather than an ArmNN layer error, so I doubt it has anything to do with layer norm (yet?). As Shelton said, it seems like an LD_LIBRARY_PATH issue. Can you point it to the directory that contains the pre-built binaries?
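One quick way to test that hypothesis (a sketch): try to load libarmnn.so directly from Python before loading the delegate; if this raises, the delegate's loader cannot find it either:

import ctypes

# CDLL uses dlopen(), which searches LD_LIBRARY_PATH, the same mechanism
# the delegate relies on to resolve its libarmnn.so dependency
try:
    ctypes.CDLL("libarmnn.so")
    print("libarmnn.so resolved")
except OSError as err:
    print("libarmnn.so not found:", err)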

ArmRyan avatar Aug 23 '22 07:08 ArmRyan

Thanks for the responses. I doubt this is directly the issue, since I've seen this "backend missing" warning before with ResNets that still ran without problems.

I've downloaded the prebuilt binaries; is there any documentation on what needs to be done to use them? I'd rather not build the delegate if I can avoid it, but I agree that it seems like libarmnn.so is not being found.

Could its path be specified in load_delegate(options=...)? Is there documentation on which options are available? (I haven't found any.)

Thanks

kylesayrs avatar Aug 25 '22 20:08 kylesayrs

We have this quick start guide or this slightly more in-depth guide that might be able to help you.

Options can be found here

Adding the directory containing libarmnn.so to LD_LIBRARY_PATH should be sufficient for the application to find it.
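For example (a sketch; the ~/ArmNN-linux-aarch64 path and script name are taken from the snippets earlier in this thread):

export LD_LIBRARY_PATH=$HOME/ArmNN-linux-aarch64:$LD_LIBRARY_PATH
python tflite_engine.py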

ArmRyan avatar Aug 30 '22 09:08 ArmRyan

Thanks @ArmRyan, those resources are very helpful.

I've added the libraries to my LD_LIBRARY_PATH and no longer get the warning, but beyond that nothing has changed:

Info: ArmNN v29.0.0
Info: Initialization time: 0.98 ms.
INFO: TfLiteArmnnDelegate: Created TfLite ArmNN delegate.
Traceback (most recent call last):
  File "/home/ubuntu/arm-competitive-benchmarking/env/bin/deepsparse.benchmark", line 33, in <module>
    sys.exit(load_entry_point('deepsparse-nightly', 'console_scripts', 'deepsparse.benchmark')())
  File "/home/ubuntu/arm-competitive-benchmarking/deepsparse/src/deepsparse/benchmark/benchmark_model.py", line 446, in main
    result = benchmark_model(
  File "/home/ubuntu/arm-competitive-benchmarking/deepsparse/src/deepsparse/benchmark/benchmark_model.py", line 378, in benchmark_model
    model = TFLEngine(
  File "/home/ubuntu/arm-competitive-benchmarking/deepsparse/src/deepsparse/benchmark/tflite_engine.py", line 85, in __init__
    self._interpreter = tflite_runtime.interpreter.Interpreter(
  File "/home/ubuntu/arm-competitive-benchmarking/env/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 495, in __init__
    self._interpreter.ModifyGraphWithDelegate(
RuntimeError
Info: Shutdown time: 0.02 ms.

kylesayrs avatar Aug 30 '22 18:08 kylesayrs

On Raspberry Pi I got the error Warning: GetConnectedConstantAsInputTensors() called on Layer with no connected Constants as Input Tensors.

kylesayrs avatar Sep 02 '22 16:09 kylesayrs

Hey @kylesayrs,

I'm sorry for the delay, but did you run with the latest prebuilt binaries?

If not, could you please try again with these? I cannot see where this warning is coming from, and I am surprised that it is surfacing as an error. A warning alone should not prevent you from doing what you would like.

keidav01 avatar Oct 05 '22 12:10 keidav01

Hi @kylesayrs

I'm closing this as inactive. If you still need help, please reopen this issue or create a new one.

Best regards, Mike

MikeJKelly avatar Dec 08 '22 10:12 MikeJKelly