ToduBem
Hi @Railcalibur, the error occurred because the first layer was set to int8 precision. If you want to use fp16 input, please set the precision of the first layer...
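For reference, a minimal sketch of what this could look like with the TensorRT Python API (not from the original thread; `network` and `config` are assumed to come from the usual `trt.Builder` setup):

```python
import tensorrt as trt

def force_first_layer_fp16(network: trt.INetworkDefinition,
                           config: trt.IBuilderConfig) -> None:
    """Request fp16 for the first layer of a mixed-precision build."""
    config.set_flag(trt.BuilderFlag.FP16)
    # Make TensorRT honor the per-layer precision request below.
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
    first = network.get_layer(0)
    first.precision = trt.float16          # run the first layer in fp16
    first.set_output_type(0, trt.float16)  # keep its output in fp16 as well
```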
Using cuDLA requires that all layers be supported by DLA, so we moved several unsupported layers into post-processing so that the pipeline does not use GPU resources at runtime. Compared to cuDLA...
Then it is likely a bandwidth-bound issue. The DLA and the GPU both consume the same resource: system DRAM. The more bandwidth-bound a workload is, the higher the chance that both DLA...
Which chip are you using? Is this a general TensorRT question? If so, could you open a new issue at https://github.com/NVIDIA/TensorRT/issues?
Closing since there has been no activity for several months, thanks!
Sorry for the late reply. I checked the source code of trtexec on the 8.4 branch; you can delete this [line](https://github.com/NVIDIA/TensorRT/blob/release/8.4/samples/common/sampleEngines.cpp#L951) and recompile trtexec. The error says "kPREFER_PRECISION_CONSTRAINTS cannot be set...
> Is the default tensor format for computation kDLA_HWC4?

The [notes in the cuDLA-samples README](https://github.com/NVIDIA-AI-IOT/cuDLA-samples/blob/main/README.md#notes) list all DLA-supported formats. kDLA_HWC4 is recommended if the first layer is a convolution layer in...
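As an illustrative sketch (not from the thread), requesting kDLA_HWC4 on a network input with the TensorRT Python API could look like this; note that kDLA_HWC4 is only valid for fp16/int8 tensors:

```python
import tensorrt as trt

def use_dla_hwc4_input(network: trt.INetworkDefinition) -> None:
    """Request the kDLA_HWC4 format on the first network input."""
    inp = network.get_input(0)
    # kDLA_HWC4 requires fp16 or int8 I/O, so set the tensor type first.
    inp.dtype = trt.float16
    # allowed_formats is a bitmask of TensorFormat values.
    inp.allowed_formats = 1 << int(trt.TensorFormat.DLA_HWC4)
```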
Closing since there has been no activity for several months, thanks all!
The only way we recommend measuring DLA task execution time is with Nsight Systems.
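For example, an illustrative invocation might look like the following (the engine path is a placeholder, and trace options for DLA vary by Nsight Systems version, so check `nsys profile --help` on your target):

```
nsys profile -t cuda,nvtx -o dla_profile \
    ./trtexec --loadEngine=model.engine --useDLACore=0
```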
Closing since there has been no activity for several months, thanks!