RuntimeError: ONNX export failed: Couldn't export operator aten::norm
Hi there,
I am trying to convert a model trained in PyTorch to ONNX format and can't find a solution to the error in the title. The model is finetuned from the ResNet-50 pretrained on ImageNet that ships with PyTorch, with one surgery: the final 1000-way linear layer was replaced by a linear layer (A) with fewer outputs, e.g. 256. During the PyTorch -> ONNX conversion, I'd like to append an L2-norm operation after layer A. I followed cfer8395's advice in this thread, but the problem persists. Could anyone shed some light on this, please?
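For what it's worth, one common workaround when the old exporter chokes on `aten::norm` (a sketch, not necessarily what cfer8395's post describes) is to spell the L2 normalization out with primitive operations (square, reduce-sum, sqrt, divide), which map onto basic ONNX ops that every opset supports. The arithmetic itself is just:

```python
import math

def l2_normalize(vec, eps=1e-12):
    # L2 norm decomposed into primitive ops (Pow/ReduceSum/Sqrt/Div in ONNX
    # terms), avoiding the single aten::norm call the exporter cannot handle.
    norm = math.sqrt(sum(x * x for x in vec)) + eps  # eps guards against division by zero
    return [x / norm for x in vec]

embed = l2_normalize([3.0, 4.0])  # a 3-4-5 triangle: normalizes to [0.6, 0.8]
```

In the model's `forward` this would look something like `x / (x.pow(2).sum(1, keepdim=True).sqrt() + eps)` instead of `x.norm(...)`, so the traced graph contains only ops the exporter already knows.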
Environment: CentOS, Python 2.7, PyTorch 0.4.0, ONNX 1.6.0, TensorRT 6.0, CUDA 9.0
@stoneyang - There have been many updates to the PyTorch-ONNX exporter since PyTorch 0.4. Could you please upgrade your PyTorch to the latest? There is support for some norm-type ops in the latest exporter.
@spandantiwari Thanks for your hint. I will try upgrading pytorch to the latest version and report further progress.
@spandantiwari I've succeeded in exporting the PyTorch model to ONNX weights, with the normalization op added, without changing my environment! Then I upgraded PyTorch from 0.4.0 to 1.0.1 (because PyTorch 0.4.0 doesn't play well with TensorRT 6.0...). But a new problem appeared: when converting the generated ONNX file to a TensorRT engine, I received the following error:
Loading ONNX file from path ./model.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file ./model.onnx; this may take a while...
[TensorRT] ERROR: Network must have at least one output
Completed creating Engine
Traceback (most recent call last):
File "deploy/onnx2trt.py", line 272, in <module>
onnx2trt(args)
File "deploy/onnx2trt.py", line 216, in onnx2trt
engine = get_engine(args.max_batch_size, args.onnx_model_path, trt_engine_path, fp16_mode, int8_mode, save_engine=args.save_trt)
File "deploy/onnx2trt.py", line 181, in get_engine
return build_engine(max_batch_size, save_engine)
File "deploy/onnx2trt.py", line 172, in build_engine
f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'
Then, inspired by this thread, I tried to visualize the ONNX weights, and the final output is there as expected.
Then I downgraded to TensorRT 4.0, since it is compatible with models exported from PyTorch 0.4.0. After some digging, I managed to load the ONNX weights, but ran into another problem:
Loading ONNX file from path ./res/model.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file ./res/model.onnx; this may take a while...
input: "443"
input: "446"
output: "embed"
op_type: "Div"
attribute {
name: "broadcast"
i: 1
type: INT
}
attribute {
name: "axis"
i: 0
type: INT
}
doc_string: "/home/code/backbone.py(87): forward\n/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py(479): _slow_forward\n/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py(489): __call__\n/home/.local/lib/python2.7/site-packages/torch/jit/__init__.py(288): forward\n/home/.local/lib/python2.7/site-packages/torch/nn/modules/module.py(491): __call__\n/home/.local/lib/python2.7/site-packages/torch/jit/__init__.py(255): get_trace_graph\n/home/.local/lib/python2.7/site-packages/torch/onnx/utils.py(134): _export\n/home/.local/lib/python2.7/site-packages/torch/onnx/utils.py(84): export\n/home/.local/lib/python2.7/site-packages/torch/onnx/__init__.py(25): export\ndeploy/torch2onnx.py(77): torch2onnx\ndeploy/torch2onnx.py(86): <module>\n"
terminate called after throwing an instance of 'std::out_of_range'
what(): No converter registered for op type: Div
Aborted
The Div op is emitted by PyTorch 0.4.0's exporter, and TensorRT is supposed to be able to convert it, am I right?
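If TensorRT 4.0 really has no converter registered for Div, a pragmatic fallback (a sketch, under the assumption that the only Div in the graph is the final L2 normalization producing "embed") is to export the model without the normalization and apply it on the host after running the engine:

```python
import math

def l2_normalize_host(batch, eps=1e-12):
    # Host-side postprocessing: apply the L2 normalization that was dropped
    # from the exported graph, so TensorRT never sees the unsupported Div op.
    out = []
    for row in batch:
        norm = math.sqrt(sum(v * v for v in row)) + eps
        out.append([v / norm for v in row])
    return out

# Embeddings as they might come back from the engine (values made up here).
raw = [[3.0, 4.0], [0.0, 2.0]]
embeds = l2_normalize_host(raw)  # each row now has unit L2 norm
```

This keeps the engine itself free of the problematic op at the cost of a cheap per-batch pass on the CPU.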
@stoneyang - This seems like an issue with TensorRT, and I can't speak to that. As a suggestion, I would recommend using the latest PyTorch when working with the ONNX exporter, so that you have all the latest updates.
@spandantiwari Oh, I see. I have already started to try other versions, and further info will be updated asap.
[TensorRT] ERROR: Network must have at least one output
[TensorRT] ERROR: Network validation failed.
Traceback (most recent call last):
File "run_onnx.py", line 81, in