
[Converter] Add support for group_norm


Could we get support for aten::group_norm?

Original PyTorch API: https://pytorch.org/docs/stable/generated/torch.nn.GroupNorm.html

tommyyliu avatar Feb 17 '22 20:02 tommyyliu

@peri044 do we need a plugin for group norm?

narendasan avatar Feb 17 '22 21:02 narendasan

Yes, we need a plugin. The group norm plugin ships with the TensorRT package (as part of libnvinfer_plugin.so). The source code is available at https://github.com/NVIDIA/TensorRT/tree/main/plugin/groupNormalizationPlugin

An implementation would look like this: we write a converter in Torch-TensorRT for group norm which calls the GN plugin in TensorRT, as follows:

auto creator = getPluginRegistry()->getPluginCreator("GroupNormalizationPlugin", "1", "torch_tensorrt");
auto group_norm_plugin = creator->createPlugin(name, &fc); // fc is the collection of parameters passed to the plugin.
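For concreteness, a minimal sketch of how the parameters might be packed into that fc before the createPlugin call. The field names "eps" and "num_groups" follow the groupNormalizationPlugin source linked above; the variable names and values here are illustrative only:

#include <vector>
#include "NvInfer.h"

// Illustrative: pack the plugin parameters into a PluginFieldCollection.
float eps = 1e-5f;
int32_t num_groups = 32;
std::vector<nvinfer1::PluginField> fields;
fields.emplace_back("eps", &eps, nvinfer1::PluginFieldType::kFLOAT32, 1);
fields.emplace_back("num_groups", &num_groups, nvinfer1::PluginFieldType::kINT32, 1);
nvinfer1::PluginFieldCollection fc;
fc.nbFields = static_cast<int32_t>(fields.size());
fc.fields = fields.data();
// The created IPluginV2 is then added to the network, e.g.
// ctx->net->addPluginV2(inputs, 1, *group_norm_plugin), where inputs is an
// array holding the input ITensor*.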

An example for reference: https://github.com/NVIDIA/Torch-TensorRT/blob/master/core/conversion/converters/impl/interpolate.cpp#L56
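And a hedged skeleton of the registration itself, following the same RegisterNodeConversionPatterns pattern used in interpolate.cpp; the schema string is the standard aten::group_norm schema, and the lambda body is only outlined:

// Sketch: registering an aten::group_norm converter (body elided).
auto group_norm_registrations = RegisterNodeConversionPatterns().pattern(
    {"aten::group_norm(Tensor input, int num_groups, Tensor? weight=None, Tensor? bias=None, float eps=1e-05, bool cudnn_enabled=True) -> (Tensor)",
     [](ConversionCtx* ctx, const torch::jit::Node* n, args& args) -> bool {
       // 1. Read num_groups and eps from args and build the PluginFieldCollection.
       // 2. Fetch the "GroupNormalizationPlugin" creator and create the plugin.
       // 3. Add the layer via ctx->net->addPluginV2(...) and associate its output
       //    with n's output value.
       return true;
     }});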

peri044 avatar Feb 17 '22 21:02 peri044

Any updates on this issue? It would be helpful for my use case too.

gadgetsam avatar Mar 29 '22 21:03 gadgetsam

Hi all, I have a first version of the group_norm layer converter using the GN plugin in TensorRT.

I added a group_norm.cpp file that wraps the correct PyTorch signature. I also had to skip the batch-size check, which otherwise leads to an unsupported operator being used for the check...

Now the compilation/conversion from a PyTorch model is OK, but running inference with the model (typically by modifying the examples/network.py file) leads to a cuDNN error:

WARNING: [Torch-TensorRT] - Group norm layer is an experimental development features and used the group_norm plugin from TensorRT plugins library
WARNING: [Torch-TensorRT] - Create group norm plugin from TensorRT plugin registry...
WARNING: [Torch-TensorRT] - Get the creator for group norm
WARNING: [Torch-TensorRT] - Create plugin
WARNING: [Torch-TensorRT] - Add plugins to the context
Warm up ...
ERROR: [Torch-TensorRT] - 2: [pluginV2DynamicExtRunner.cpp::execute::115] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed. )
ERROR: [Torch-TensorRT] - 1: [context.cpp::setStream::121] Error Code 1: Cudnn (CUDNN_STATUS_MAPPING_ERROR)
ERROR: [Torch-TensorRT] - 1: [context.cpp::setStream::121] Error Code 1: Cudnn (CUDNN_STATUS_MAPPING_ERROR)
ERROR: [Torch-TensorRT] - 1: [context.cpp::setStream::121] Error Code 1: Cudnn (CUDNN_STATUS_MAPPING_ERROR)

...

Traceback (most recent call last):
  File "network.py", line 108, in <module>
    main()
  File "network.py", line 100, in main
    benchmark(trt_ts_module, input_shape=(1, 3, 5, 5), dtype="fp32")
  File "network.py", line 42, in benchmark
    torch.cuda.synchronize()
  File "/opt/conda/lib/python3.8/site-packages/torch/cuda/__init__.py", line 494, in synchronize
    return torch._C._cuda_synchronize()
RuntimeError: CUDA error: an illegal memory access was encountered
ERROR: [Torch-TensorRT] - 1: [fusedConvActRunner.cpp::destroyFilterTexture::292] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [defaultAllocator.cpp::deallocate::35] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [cudaResources.cpp::~ScopedCudaStream::47] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [cudaResources.cpp::~ScopedCudaEvent::24] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [cudaResources.cpp::~ScopedCudaEvent::24] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [cudaResources.cpp::~ScopedCudaEvent::24] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [cudaResources.cpp::~ScopedCudaEvent::24] Error Code 1: Cuda Runtime (an illegal memory access was encountered)

Here is my new version of the network:

class ConvGelu(torch.nn.Module):
    def __init__(self):
        super(ConvGelu, self).__init__()
        self.conv = nn.Conv2d(3, 32, 3, 1)
        self.gelu = nn.GELU()

    def forward(self, x):
        x = self.conv(x)
        x = F.group_norm(x, num_groups=32)
        x = self.gelu(x)
        return x

I'm not sure if there is a problem with the cuDNN call from the GroupNorm plugin layer, which is mostly built around the cuDNN BatchNorm function.

Any feedback?

Cheers,

David

david-PHR avatar Apr 11 '22 17:04 david-PHR

Hi David, I'm also trying to implement a converter. Can you tell me how you managed to ignore the batch-size check?

timohueser avatar Apr 14 '22 12:04 timohueser

I patched PyTorch's group_norm to remove the batch-size check, but it seems that with the NGC 21.12 container there is no need to do that.

david-PHR avatar Apr 15 '22 09:04 david-PHR

I see. I'll do that as well for now then. How do you manage to pass the weight and bias arguments to the GroupNormPlugin? I can't find a way to convert the Tensors to ITensors.

timohueser avatar Apr 15 '22 19:04 timohueser

The GroupNormPlugin provided by TensorRT doesn't support learned (affine) weights and biases. I bypass this missing feature with an addScaleNd TensorRT layer that I plug in after the call to the GroupNormalizationPlugin.
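Sketched out (with hypothetical names: gn_output for the plugin layer's output ITensor, gamma_tensor/beta_tensor for the learned at::Tensor parameters), that workaround might look roughly like:

// Wrap the learned per-channel parameters as TensorRT Weights; the tensors
// must be contiguous float32 host memory that outlives engine building.
nvinfer1::Weights scale{nvinfer1::DataType::kFLOAT, gamma_tensor.data_ptr(), gamma_tensor.numel()};
nvinfer1::Weights shift{nvinfer1::DataType::kFLOAT, beta_tensor.data_ptr(), beta_tensor.numel()};
nvinfer1::Weights power{nvinfer1::DataType::kFLOAT, nullptr, 0}; // identity exponent
// Per-channel y = (x * scale + shift) ^ power on channel axis 1, which matches
// GroupNorm's per-channel affine gamma/beta.
auto scale_layer = ctx->net->addScaleNd(*gn_output, nvinfer1::ScaleMode::kCHANNEL, shift, scale, power, 1);
auto out = scale_layer->getOutput(0);

With kCHANNEL mode, scale and shift each carry one value per channel, so this recovers the affine part of GroupNorm that the plugin omits.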

david-PHR avatar Apr 19 '22 15:04 david-PHR

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

github-actions[bot] avatar Aug 17 '22 00:08 github-actions[bot]

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

github-actions[bot] avatar Nov 21 '22 00:11 github-actions[bot]

GroupNormalizationPlugin

Has the development of the plugin been completed? Could you open a pull request or share the code? Thanks!

kisisjrlly avatar Nov 24 '22 02:11 kisisjrlly

Any updates on this? Would also be useful for my model.

patrickwilliams3 avatar Dec 21 '22 04:12 patrickwilliams3

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

github-actions[bot] avatar Mar 22 '23 00:03 github-actions[bot]

This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.

github-actions[bot] avatar Jun 21 '23 00:06 github-actions[bot]