
RuntimeError: ONNX export failed: Couldn't export operator aten::norm

stoneyang opened this issue 6 years ago · 7 comments

Hi there,

I am trying to convert a model trained in PyTorch to ONNX format, and I can't find a solution to the problem in the title. The model is a resnet50 pretrained on ImageNet (as shipped with PyTorch), finetuned after replacing the final 1000-way linear layer with a linear layer (A) that has fewer outputs, e.g. 256. In the PyTorch -> ONNX conversion I'd like to append an l2-norm operation after layer A. I followed the advice cfer8395 gave in this thread, but the problem persists. Could anyone shed some light on this, please?

Environment: CentOS, Python 2.7, PyTorch 0.4.0, ONNX 1.6.0, TensorRT 6.0, CUDA 9.0
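For reference, here is a minimal sketch (not the exact code from this issue; `EmbeddingNet`, `embed_dim` and the file name are made up for illustration) of the kind of model surgery and export described above:

```python
import torch
import torch.nn.functional as F
import torchvision

class EmbeddingNet(torch.nn.Module):
    def __init__(self, embed_dim=256):
        super(EmbeddingNet, self).__init__()
        self.backbone = torchvision.models.resnet50(pretrained=True)
        # replace the 1000-way classifier with a smaller linear layer (A)
        self.backbone.fc = torch.nn.Linear(self.backbone.fc.in_features, embed_dim)

    def forward(self, x):
        feat = self.backbone(x)
        # L2-normalize the embedding; older exporters may fail here with
        # "Couldn't export operator aten::norm"
        return F.normalize(feat, p=2, dim=1)

model = EmbeddingNet().eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["embed"])
```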

stoneyang avatar Nov 05 '19 12:11 stoneyang

@stoneyang - There have been many updates to the PyTorch-ONNX exporter since PyTorch 0.4. Could you please upgrade your PyTorch to the latest? There is support for some norm-type ops in the latest exporter.

spandantiwari avatar Nov 05 '19 18:11 spandantiwari

@spandantiwari Thanks for your hint. I will try upgrading pytorch to the latest version and report further progress.

stoneyang avatar Nov 06 '19 02:11 stoneyang

@spandantiwari I've succeeded in exporting the PyTorch model to ONNX, with the normalization op included, without changing my environment! I then upgraded PyTorch from 0.4.0 to 1.0.1 (because PyTorch 0.4.0 doesn't play well with TensorRT 6.0). But a new problem shows up: when converting the generated ONNX file into a TensorRT engine, I received the following error:

Loading ONNX file from path ./model.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file ./model.onnx; this may take a while...
[TensorRT] ERROR: Network must have at least one output
Completed creating Engine
Traceback (most recent call last):
  File "deploy/onnx2trt.py", line 272, in <module>
    onnx2trt(args)
  File "deploy/onnx2trt.py", line 216, in onnx2trt
    engine = get_engine(args.max_batch_size, args.onnx_model_path, trt_engine_path, fp16_mode, int8_mode, save_engine=args.save_trt)
  File "deploy/onnx2trt.py", line 181, in get_engine
    return build_engine(max_batch_size, save_engine)
  File "deploy/onnx2trt.py", line 172, in build_engine
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'

Then, inspired by this thread, I visualized the ONNX graph, and the final output is there as expected.
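For anyone hitting the same "Network must have at least one output" error, a commonly suggested workaround (a sketch only, assuming the standard TensorRT Python API of that era; `build_engine` and the variable names here are illustrative, not the script from this issue) is to mark the last layer's output explicitly after parsing:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path):
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                # print parser errors instead of failing silently
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        # If the parser dropped the graph outputs, the builder reports
        # "Network must have at least one output"; mark one manually.
        if network.num_outputs == 0:
            last_layer = network.get_layer(network.num_layers - 1)
            network.mark_output(last_layer.get_output(0))
        return builder.build_cuda_engine(network)
```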

stoneyang avatar Nov 06 '19 07:11 stoneyang

Then I downgraded to TensorRT 4.0, since it is compatible with models exported from PyTorch 0.4.0. After some digging, I managed to load the ONNX weights, but hit another problem:

Loading ONNX file from path ./res/model.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file ./res/model.onnx; this may take a while...
input: "443"
input: "446"
output: "embed"
op_type: "Div"
attribute {
  name: "broadcast"
  i: 1
  type: INT
}
attribute {
  name: "axis"
  i: 0
  type: INT
}
doc_string: "/home/code/backbone.py(87): forward\n/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py(479): _slow_forward\n/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py(489): __call__\n/home/.local/lib/python2.7/site-packages/torch/jit/__init__.py(288): forward\n/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py(491): __call__\n/home/.local/lib/python2.7/site-packages/torch/jit/__init__.py(255): get_trace_graph\n/home/.local/lib/python2.7/site-packages/torch/onnx/utils.py(134): _export\n/home/.local/lib/python2.7/site-packages/torch/onnx/utils.py(84): export\n/home/.local/lib/python2.7/site-packages/torch/onnx/__init__.py(25): export\ndeploy/torch2onnx.py(77): torch2onnx\ndeploy/torch2onnx.py(86): <module>\n"
terminate called after throwing an instance of 'std::out_of_range'
  what():  No converter registered for op type: Div
Aborted

The Div op is emitted by PyTorch 0.4.0's exporter, and it is supposed to be convertible by TensorRT, am I right?
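If TensorRT 4's ONNX parser really has no converter for Div, one hedged workaround (assuming the Div comes only from the final L2-normalization) is to export the model without the norm and normalize the embeddings on the host after inference, e.g.:

```python
import numpy as np

def l2_normalize(embed, eps=1e-12):
    # embed: (batch, dim) array copied back from the TensorRT output buffer
    norm = np.linalg.norm(embed, axis=1, keepdims=True)
    return embed / np.maximum(norm, eps)
```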

stoneyang avatar Nov 06 '19 09:11 stoneyang

@stoneyang - This seems like an issue with TensorRT, and I can't speak to that. As a suggestion, I would recommend using the latest PyTorch when working with the ONNX exporter; that way you will have the latest updates.

spandantiwari avatar Nov 06 '19 21:11 spandantiwari

@spandantiwari Oh, I see. I have already started trying other versions and will post further updates as soon as possible.

stoneyang avatar Nov 07 '19 08:11 stoneyang

[TensorRT] ERROR: Network must have at least one output
[TensorRT] ERROR: Network validation failed.
Traceback (most recent call last):
  File "run_onnx.py", line 81, in <module>
    run_onnx()
  File "run_onnx.py", line 67, in run_onnx
    with build_engine_onnx(onnx_model_file) as engine:
AttributeError: __enter__
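That AttributeError usually means `build_engine_onnx` returned None (the engine build failed), so the with-statement has nothing to enter. A small guard makes the real error visible (sketch only; `build_engine_onnx` and `onnx_model_file` are the names from the traceback above):

```python
engine = build_engine_onnx(onnx_model_file)
if engine is None:
    # the build failed earlier; the parser/builder errors above are the real cause
    raise RuntimeError("TensorRT engine build failed")
with engine:
    ...  # run inference as before
```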

cloudrivers avatar Jan 30 '20 15:01 cloudrivers