failed to build the serialized network when running a valid onnx model on GPU 3080: dimensions not compatible for Gather with GatherMode = kND
Description
The following ONNX model can be imported by the ONNX parser in TensorRT. However, building the serialized network fails with the following error message:
[05/28/2025-16:06:34] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 4: API Usage Error (IGatherLayer gathernd_node: gathernd_node: dimensions not compatible for Gather with GatherMode = kND)
Environment
TensorRT Version: 10.11.0.33
NVIDIA GPU: GeForce RTX 3080
NVIDIA Driver Version: 535.183.01
CUDA Version: 12.2
CUDNN Version: none
Operating System: ubuntu 20.04
Python Version (if applicable): 3.12.9
Tensorflow Version (if applicable): none
PyTorch Version (if applicable): none
Baremetal or Container (if so, version): none
Steps To Reproduce
This bug can be reproduced with the following code and the model in the attachment. As the code shows, the model runs correctly under onnxruntime.
```python
from typing import Dict, List, Literal, Optional
import sys
import os
import numpy as np
import onnx
import onnxruntime
from onnx import ModelProto, TensorProto, helper, mapping
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit
import pickle


def test():
    onnx_model = onnx.load("111.onnx")
    onnx_model.ir_version = 8
    onnx_model.opset_import[0].version = 14
    with open("inputs.pkl", "rb") as fp:
        inputs = pickle.load(fp)
    try:
        ort_session = onnxruntime.InferenceSession(
            onnx_model.SerializeToString(), providers=["CPUExecutionProvider"]
        )
        ort_output = ort_session.run([], inputs)
    except Exception as e:
        print(e)
        print("This model cannot be executed by onnxruntime!")
        sys.exit(1)
    print("ONNXRuntime:\n", ort_output)
    # --------------------------------------------------------
    trt_logger = trt.Logger(trt.Logger.WARNING)
    trt.init_libnvinfer_plugins(trt_logger, '')
    builder = trt.Builder(trt_logger)
    network = builder.create_network(
        flags=1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, trt_logger)
    with open("111.onnx", 'rb') as model_file:
        if not parser.parse(model_file.read()):
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            sys.exit(1)
    config = builder.create_builder_config()
    serialized_engine = builder.build_serialized_network(network, config)
    if serialized_engine is None:
        sys.exit(1)


if __name__ == "__main__":
    test()
```
Commands or scripts:
Have you tried the latest release?: yes
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): the model can be executed by onnxruntime.
There are three options to resolve the issue. Basically, the checker inside ONNX is baked into each compiled wheel. I tried the third option first, and it worked; then the first, which is a fairly long process, and later the second, which is quite easy.
- So I tried downloading the ONNX source and, inside the onnx/checker.h file, under the `checker` namespace, changing the IR version constant before rebuilding:

```cpp
constexpr uint32_t IR_VERSION = 0x0000000A; // pretend IR v10
```
Shim it in Python: temporarily spoof `ir_version` to 10 for simplification, then restore it:

```python
import sys
import onnx
from onnxsim import simplify

IN, OUT = sys.argv[1:]
model = onnx.load(IN)
orig_ir = model.ir_version
if orig_ir > 10:
    model.ir_version = 10  # spoof so the baked-in checker accepts the model
opt_model, _ = simplify(model, dynamic_input_shape=True)
opt_model.ir_version = orig_ir  # restore the real IR version
onnx.save(opt_model, OUT)
```
2) Temporarily spoof it from a separate Python file:

```python
import sys
import tempfile
import onnx
from pathlib import Path
from onnxsim import simplify

IN, OUT, *_ = sys.argv[1:]
PRINT = "--print" in sys.argv
m = onnx.load(IN)
real_ir = m.ir_version
if real_ir > 10:
    m.ir_version = 10  # spoof for the checker
model_opt, _ = simplify(m, dynamic_input_shape=True)
model_opt.ir_version = real_ir  # restore
onnx.save(model_opt, OUT)
if PRINT:
    print("saved", OUT, "with ir_version", real_ir)
```
- The simplest option is to skip the checker entirely, which I wouldn't recommend, but it is the fastest quick fix:

```
python -m onnxsim 111.onnx 111_static.onnx --skip-checker
```
Thank you for your reply! I have tried your method, but the issue persists. The version of onnxsim is 0.4.36.
There might be some version conflicts with TensorFlow, so you may need to fix those and tweak your functions, or you could ignore the errors as mentioned in the other steps.
I ran into a similar issue with Pydantic and vLLM-GPU. I tried a bunch of approaches, but version conflicts kept popping up, so I ended up ignoring the warnings. It wasn’t ideal, but it did the trick.