lanluo-nvidia

Results 40 comments of lanluo-nvidia

Just saw a new update in the upstream torch decomp code. We need to add that as well: [from torch._export.utils import _decomp_table_to_post_autograd_aten](https://github.com/pytorch/pytorch/commit/1f32a1fb80e0c1826ad931832b93d4f82c3ecd98#diff-4a060a24e1f81389eab7390d434dddf919af50aa8fda6cbd81e182a53bd9328eR25)

@orioninthesky98 I tried the example on the latest main and on our upcoming 2.5.0 release; both work as expected. I think the batchnorm3d bug has been fixed. Also...

When I tested it locally, I got the following error at https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py#L150: `torch_tensorrt [TensorRT Conversion Context]:logging.py:24 IRuntime::deserializeCudaEngine: Error Code 1: Serialization (Serialization assertion plan->header.pad == expectedPlatformTag failed.Platform specific tag mismatch...

This is what @peri044 replied on Slack: this happens because we run shape analysis during the save call, which expects the engines to be set up. We can...

@MaltoseFlower do you have any example code I can use to look into this further?

Meanwhile, here is an example of fp8/int8 PTQ for your reference: https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/vgg16_ptq.py

I have verified on my Linux machine, with the same torch/torch_tensorrt/CUDA/Python versions, that it is not working. It first shows the error: `Potential NSFW content was detected in one or...

@pangyoki I am currently hitting a CUDA out-of-memory error when trying to run this model on my RTX 4080, which has 16 GB of GPU memory. I will verify on...

I have verified on the latest main branch that I cannot generate a correct image from the "stabilityai/stable-diffusion-3-medium-diffusers" model; it gives me a nonsense image. However, I can successfully...