NeMo fastpitch onnx convert to tensorrt failure of TensorRT 10.3.0
Description
Environment
I'm using this docker image: nvcr.io/nvidia/tensorrt:24.08-py3
TensorRT Version: 10.3.0
NVIDIA GPU: L40S
NVIDIA Driver Version: 535.183.01
CUDA Version: 12.6
CUDNN Version: 9.3.0
Operating System: Ubuntu 22.04
Python Version (if applicable): 3.10.12
PyTorch Version (if applicable): 2.4.0
Relevant Files
Model link:
Steps To Reproduce
First install NeMo
pip install "nemo_toolkit[tts]"
- Code used to generate the ONNX model:
from nemo.collections.tts.models.fastpitch import FastPitchModel
spec_model = FastPitchModel.from_pretrained("tts_en_fastpitch")
spec_model.export('ljspeech.onnx', onnx_opset_version=20)
- Command that reproduces the error:
trtexec --onnx=ljspeech.onnx --minShapes=text:1x32,pitch:1x32,pace:1x32 --optShapes=text:1x768,pitch:1x768,pace:1x768 --maxShapes=text:1x1664,pitch:1x1664,pace:1x1664 --shapes=text:1x768,pitch:1x768,pace:1x768 --memPoolSize=workspace:4096 --noTF32 --saveEngine=ljspeech.engine
- The error is:
[E] Error[7]: IExecutionContext::enqueueV3: Error Code 7: Internal Error (/decoder/layers.0/dec_attn/MatMul_1: attempt to multiply two matrices with mismatching dimensions Condition '==' violated: 0 != 1. Instruction: CHECK_EQUAL 0 1.)
[E] Error occurred during inference
Commands or scripts:
Have you tried the latest release?: yes
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): yes
Can you verify that the shapes provided to the trtexec call are valid? The error could be caused by an invalid shape profile being passed in.
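As a quick sanity check (a plain-Python sketch, not TensorRT's actual validation logic), every optimization profile must at least satisfy min <= opt <= max elementwise for each input. The profile in the failing command passes this basic check, which suggests the rejection is model-specific rather than a malformed profile:

```python
# Sketch: elementwise check that a trtexec-style shape profile is monotone
# (min <= opt <= max per dimension). This mirrors only the basic constraint;
# TensorRT may still reject a monotone profile for model-specific reasons.

def profile_is_monotone(min_shape, opt_shape, max_shape):
    """Return True if min <= opt <= max holds for every dimension."""
    if not (len(min_shape) == len(opt_shape) == len(max_shape)):
        return False
    return all(lo <= mid <= hi
               for lo, mid, hi in zip(min_shape, opt_shape, max_shape))

# Profile from the failing trtexec command (text/pitch/pace all share it).
print(profile_is_monotone((1, 32), (1, 768), (1, 1664)))   # True
print(profile_is_monotone((1, 32), (1, 2000), (1, 1664)))  # False: opt > max
```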
Thanks for the response. After changing the input shape, I am able to execute the trtexec command and generate the engine file, but there is another problem now.
trtexec --onnx=ljspeech.onnx --minShapes=text:1x32,pitch:1x32,pace:1x32 --optShapes=text:1x128,pitch:1x128,pace:1x128 --maxShapes=text:1x128,pitch:1x128,pace:1x128 --shapes=text:1x128,pitch:1x128,pace:1x128 --memPoolSize=workspace:4096 --noTF32 --saveEngine=ljspeech.engine
After setting a concrete dynamic input shape, the reported output shape did not change, so I cannot determine the real output shape. Correct me if I'm wrong.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.ERROR)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

engine_filepath = 'ljspeech.engine'
with open(engine_filepath, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()
context.set_input_shape('text', (1, 33))
context.set_input_shape('pitch', (1, 33))
context.set_input_shape('pace', (1, 33))
print('all_binding_shapes_specified: ', context.all_binding_shapes_specified)
print('spect shape: ', context.get_tensor_shape('spect'))
print('num_frames', context.get_tensor_shape('num_frames'))
print('durs_predicted', context.get_tensor_shape('durs_predicted'))
print('log_durs_predicted', context.get_tensor_shape('log_durs_predicted'))
print('pitch_predicted', context.get_tensor_shape('pitch_predicted'))
The output is:
all_binding_shapes_specified: True
spect shape: (1, 80, -1)
num_frames (1,)
durs_predicted (1, 33)
log_durs_predicted (1, 33)
pitch_predicted (1, 33)
@yuananf Which specific dims have the wrong value? IIUC, some of them have changed.
Can you confirm whether the "specific dims" are the ones mentioned in the API doc?
A dimension in an output tensor will have a -1 wildcard value if the dimension depends on values of execution tensors, OR if all of the following are true:
- It is a runtime dimension.
- setInputShape() has NOT been called for some input tensor(s) with a runtime shape.
- setTensorAddress() has NOT been called for some input tensor(s) with isShapeInferenceIO() = true.
An output tensor may also have -1 wildcard dimensions if its shape depends on values of tensors supplied to enqueueV3().
As you can see from the previous comment, the output shape of spect is still (1, 80, -1) after all input shapes are set.
I can confirm that set_input_shape is called for all input tensors.
So the reason might be: "An output tensor may also have -1 wildcard dimensions if its shape depends on values of tensors supplied to enqueueV3()."
What does this mean?
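It means the time dimension of spect cannot be derived from the input shapes alone: in FastPitch the number of mel frames is determined by the durations the duration predictor outputs at runtime, so TensorRT can only know it after enqueueV3() has actually executed (this is also why the engine exposes a num_frames output). A minimal sketch of that dependency, in plain Python with made-up duration values rather than the real predictor:

```python
# Sketch: why spect's last dim stays -1 until inference runs. In FastPitch
# the decoder length equals the sum of per-token durations produced by the
# duration predictor, so it depends on tensor *values*, not input *shapes*.
# The duration values below are invented for illustration.

def mel_frames(durs_predicted, pace=1.0):
    """Number of mel frames after length regulation: sum of scaled durations."""
    return sum(max(0, round(d / pace)) for d in durs_predicted)

# Two inputs with the SAME shape (5 tokens) but different predicted durations
# yield different output lengths -- hence the (1, 80, -1) wildcard shape.
print(mel_frames([3, 5, 2, 4, 6]))            # 20 frames
print(mel_frames([1, 1, 1, 1, 1]))            # 5 frames
print(mel_frames([3, 5, 2, 4, 6], pace=2.0))  # 10 frames: faster pace, fewer frames
```

In practice this means get_tensor_shape('spect') will keep returning -1 before execution; a common workaround is to size the output buffer for the worst case (or use TensorRT's IOutputAllocator for data-dependent shapes) and read num_frames after enqueueV3() to know how much of it is valid.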
Any update on this issue?