🐛 [Bug] aten::add does not convert with dynamic batch

Open narendasan opened this issue 4 years ago • 7 comments

Bug Description

When converting with a dynamic batch size, the dimension checks for aten::add fail and report:

ERROR: [TRTorch Conversion Context] - %457 : Tensor = aten::add_(%456, %455, %3): broadcast dimensions must be non-negative
ERROR: [TRTorch Conversion Context] - Builder failed while analyzing shapes.
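
One plausible reading of the error, offered as an assumption rather than a diagnosis: TensorRT marks a dimension whose size is unknown at build time as -1, so a broadcast check that insists on non-negative dimensions would reject a dynamic batch axis. The sketch below illustrates that idea in plain Python. It is not the TRTorch converter code; `broadcast_shapes`, `DYNAMIC`, and `allow_dynamic` are hypothetical names used only for illustration.

```python
DYNAMIC = -1  # stand-in for a dimension unknown at build time (assumption of this sketch)


def broadcast_shapes(a, b, allow_dynamic=True):
    """Return the broadcast shape of a and b, right-aligned as in PyTorch/NumPy."""
    rank = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        if DYNAMIC in (da, db):
            if not allow_dynamic:
                # A strict check like this would trip on a dynamic batch axis.
                raise ValueError("broadcast dimensions must be non-negative")
            other = db if da == DYNAMIC else da
            # A known size greater than 1 fixes the result; otherwise it stays dynamic.
            out.append(other if other not in (DYNAMIC, 1) else DYNAMIC)
        elif da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError(f"cannot broadcast {da} with {db}")
    return tuple(out)


# Residual add in a ResNet block with a dynamic batch axis (shapes are illustrative).
result = broadcast_shapes((DYNAMIC, 64, 56, 56), (DYNAMIC, 64, 56, 56))
```

With `allow_dynamic=False` the same call raises, which mirrors the kind of failure reported in the log above.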

To Reproduce

Steps to reproduce the behavior:

  1. Run trtorchc on ResNet50 with dynamic batch size
  2. trtorchc tests/modules/resnet50_scripted.jit.pt /out/file.trt "[(1,3,224,224);(3,3,224,224);(5,3,224,224)]" -v
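
For readers unfamiliar with the shape argument in step 2, here is a minimal sketch of how the bracketed spec can be read, assuming it follows the `[(min);(opt);(max)]` convention for dynamic input ranges. `parse_dynamic_shape` is a hypothetical helper written for this illustration and is not part of trtorchc.

```python
import ast


def parse_dynamic_shape(spec):
    """Parse "[(min);(opt);(max)]" into a dict of three shape tuples."""
    body = spec.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected a bracketed range like [(min);(opt);(max)]")
    # Each ';'-separated part is a tuple literal such as "(1,3,224,224)".
    parts = [ast.literal_eval(p) for p in body[1:-1].split(";")]
    if len(parts) != 3:
        raise ValueError("expected exactly three shapes: min, opt, max")
    return dict(zip(("min", "opt", "max"), parts))


shapes = parse_dynamic_shape("[(1,3,224,224);(3,3,224,224);(5,3,224,224)]")

# Axes whose size differs between min and max are the dynamic ones.
dynamic_axes = [
    i for i, (lo, hi) in enumerate(zip(shapes["min"], shapes["max"])) if lo != hi
]
```

Under that reading, only axis 0 (the batch dimension) varies in the reproduction command, from 1 to 5 with 3 as the optimization target.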

Expected behavior

Model converts properly

Environment

Build information about the TRTorch compiler can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0): 1.7.1
  • CPU Architecture: x86
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): pip
  • Build command you used (if compiling from source): source [00f2d78b2768c7f65bffcff6ba0aa287838b2ca0]
  • Are you using local sources or building from archives: archives
  • Python version: python 3.6
  • CUDA version: 11.0
  • GPU models and configuration: TITAN V
  • Any other relevant information:

Additional context

log.txt

narendasan avatar Feb 23 '21 23:02 narendasan

This issue has not seen activity for 90 days, Remove stale label or comment or this will be closed in 10 days

github-actions[bot] avatar May 31 '21 00:05 github-actions[bot]

cc: @apbose

narendasan avatar May 20 '22 01:05 narendasan

@apbose any updates on this issue?

narendasan avatar Aug 19 '22 00:08 narendasan

Hi @narendasan I tested this locally and the test passes now.

apbose avatar Aug 22 '22 17:08 apbose

Do we have a dynamic batch aten::add test?

narendasan avatar Aug 22 '22 17:08 narendasan

bump @apbose to Naren's question

ncomly-nvidia avatar Nov 15 '22 19:11 ncomly-nvidia

I am not sure, but shouldn't these test cases be included in test_dynamic_fallback.cpp under TensorRT/tests/cpp?

apbose avatar Dec 01 '22 07:12 apbose

test_dynamic_fallback has test cases where the model has dynamic shapes and ops that fall back. This issue is quite old and concerns dynamic-shape aten::add not being converted properly. There are now dynamic batch test cases for aten::add (https://github.com/pytorch/TensorRT/blob/master/tests/core/conversion/converters/test_element_wise.cpp#L69-L70). We probably don't have an end-to-end dynamic batch model test case (full_compilation=True).

peri044 avatar Dec 01 '22 10:12 peri044

This is now working; it just needs to be tested.

Christina-Young-NVIDIA avatar Dec 20 '22 01:12 Christina-Young-NVIDIA

Closing the issue as it has been tested.

apbose avatar Jan 13 '23 16:01 apbose