
🐛 [Bug] aten::_convolution output shape no longer matches Pytorch after TensorRT 10 upgrade

Open matthewfl opened this issue 1 year ago • 4 comments

Bug Description

The output shape of aten::_convolution no longer matches PyTorch after the TensorRT 10 upgrade.

I have noticed that the output shape is correct when I pass the weight matrix in as a constant instead of as an input, as done in the reproducer below.
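For comparison, a sketch of that constant-weight variant (the class name is illustrative, and the .cuda() calls are omitted so it runs on CPU): the weight becomes a module parameter instead of a forward() input.

```python
import torch

class ModelConstWeight(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Weight baked into the module as a constant/parameter,
        # rather than passed in as a graph input.
        self.weight = torch.nn.Parameter(torch.rand(144, 1, 3, 3))

    def forward(self, x):
        return torch.ops.aten._convolution(
            x, self.weight, None, [1, 1], [1, 1], [1, 1],
            False, [0, 0], 144, True, True, True, True)

out = ModelConstWeight()(torch.rand(20, 144, 55, 55))
```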

To Reproduce

import torch, torch_tensorrt

class Model(torch.nn.Module):
    def forward(self, x, y):
        return torch.ops.aten._convolution(
            x, y, None, [1, 1], [1, 1], [1, 1],
            False, [0, 0], 144, True, True, True, True)

model = Model().cuda()
input = torch.rand(20, 144, 55, 55).cuda(), torch.rand(144, 1, 3, 3).cuda()

compiled = torch_tensorrt.ts.compile(torch.jit.trace(model, input), input)

assert compiled(*input).shape == model(*input).shape

Expected behavior

The program should work without the assert failing.
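For reference, with a 3x3 kernel, stride 1, padding 1, and dilation 1, this depthwise convolution is shape-preserving, so the output should be (20, 144, 55, 55). A quick eager-mode check of that expected shape (assumption: CPU tensors instead of the .cuda() ones in the reproducer):

```python
import torch

x = torch.rand(20, 144, 55, 55)
w = torch.rand(144, 1, 3, 3)

# Standard convolution size formula:
#   H_out = floor((H_in + 2*pad - dilation*(k - 1) - 1) / stride) + 1
h_out = (55 + 2 * 1 - 1 * (3 - 1) - 1) // 1 + 1  # 55: shape-preserving

out = torch.ops.aten._convolution(
    x, w, None, [1, 1], [1, 1], [1, 1],
    False, [0, 0], 144, True, True, True, True)
assert out.shape == (20, 144, h_out, h_out)
```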

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 2.4.0
  • PyTorch Version (e.g. 1.0): 2.4.1
  • CPU Architecture:
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

matthewfl avatar Sep 24 '24 19:09 matthewfl

Hi @zewenli98, is there any update on this?

matthewfl avatar Oct 01 '24 13:10 matthewfl

@matthewfl This seems to be an issue with the TorchScript (TS) path, which is no longer supported. Can you try replacing

compiled = torch_tensorrt.ts.compile(torch.jit.trace(model,input),input)

with

exp_program = torch.export.export(model, input)
compiled = torch_tensorrt.dynamo.compile(exp_program, input, min_block_size=1)

zewenli98 avatar Oct 01 '24 19:10 zewenli98

It seems this issue is related to the output/post-padding being handled differently between the two code paths: https://github.com/pytorch/TensorRT/blob/main/core/conversion/converters/impl/conv_deconv.cpp#L201

matthewfl avatar Oct 18 '24 18:10 matthewfl

It seems this issue is related to the output/post-padding being handled differently between the two code paths: https://github.com/pytorch/TensorRT/blob/main/core/conversion/converters/impl/conv_deconv.cpp#L201

Right, the dynamo path uses pre-padding by default.
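To illustrate the distinction being discussed (a sketch using torch.nn.ConvTranspose2d, not the TensorRT converter code): padding ("pre-padding") shrinks a deconvolution's output, while output_padding ("post-padding") grows it, so handling them differently changes the resulting shape.

```python
import torch

# Transposed-convolution output size:
#   H_out = (H_in - 1)*stride - 2*padding + dilation*(k - 1) + output_padding + 1
x = torch.rand(1, 8, 10, 10)
for out_pad in (0, 1):
    deconv = torch.nn.ConvTranspose2d(
        8, 8, kernel_size=3, stride=2, padding=1, output_padding=out_pad)
    expected = (10 - 1) * 2 - 2 * 1 + 1 * (3 - 1) + out_pad + 1
    # out_pad=0 -> 19, out_pad=1 -> 20: post-padding grows the output.
    assert deconv(x).shape[-1] == expected
```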

zewenli98 avatar Oct 18 '24 18:10 zewenli98