Ismayil Ismayilov

Results: 10 comments by Ismayil Ismayilov

I am facing the same error on `torch 2.5.0+cu124`. The error is preceded by the following warning:

```
cuDNN SDPA backward got grad_output.strides() != output.strides()
```

I'm on an H100,...
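
A minimal sketch of the first thing I would try to confirm that the cuDNN SDPA backend is the culprit: exclude it from backend selection and rerun the backward pass (the tensor shapes below are arbitrary).

```py
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

# Arbitrary half-precision tensors just to exercise the SDPA backward pass.
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16, requires_grad=True)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Restrict SDPA to the flash / memory-efficient backends, skipping cuDNN.
with sdpa_kernel([SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION]):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
    out.sum().backward()
```

I believe `torch.backends.cuda.enable_cudnn_sdp(False)` disables the same backend globally, if the context manager is inconvenient.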

Seems to be NVIDIA/cudnn-frontend#75 and https://github.com/NVIDIA/cudnn-frontend/issues/78

@lanluo-nvidia Thank you for the reply. That is very strange. I will try with today's nightly and report back. Also, I am running this on an H100, could that possibly...

I tried again with today's nightly (`torch_tensorrt==2.5.0.dev20240918+cu124`, `torch==dev20240912+cu124`) and I am encountering the same runtime error. Additionally, the results from the compiled UNet match those of the original UNet. At this point,...

I also tried with release 2.4. There, I can successfully save and load the model, but the compiled model's outputs are full of NaNs. In general, Stable Diffusion with Torch-TensorRT...
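
The NaN check itself is nothing elaborate; a sketch, using `loaded_unet` and `arg_inputs_unet` from my repro script (the output handling is simplified here):

```py
import torch

# Run the reloaded model and report what fraction of each output is NaN.
with torch.inference_mode():
    outputs = loaded_unet(*arg_inputs_unet)

for i, out in enumerate(outputs):
    print(f"output {i}: NaN fraction = {torch.isnan(out).float().mean().item():.3f}")
```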

@lanluo-nvidia After loading the UNet, I first check if the results match (`expected_outputs_unet` is defined in the previous code block):

```py
with torch.inference_mode():
    tensorrt_outputs_unet = loaded_unet(*arg_inputs_unet)
    for expected_output, tensorrt_output in ...
```
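
Continuing the block above, the loop is roughly of this shape; the tolerances here are placeholders rather than the exact values from my script:

```py
for expected_output, tensorrt_output in zip(expected_outputs_unet, tensorrt_outputs_unet):
    # rtol/atol are placeholders, not necessarily the values from my script.
    torch.testing.assert_close(tensorrt_output, expected_output, rtol=1e-3, atol=1e-3)
```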

Yes, the error in Test 3) is exactly what I'm getting on my H100. I thought the problem might be with `torch.export`, so I already created an issue on the...

@lanluo-nvidia Also, in my H100 tests, the model compiles successfully, the UNet results match (using `compiled_unet` directly), and I can generate an image (if I use `compiled_unet` in place of...
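
By "in place of" I mean swapping the compiled module into the diffusers pipeline, roughly as below; this assumes `pipe` is an already-loaded `StableDiffusionPipeline`, `compiled_unet` is the Torch-TensorRT module from above, and the prompt and filename are illustrative. Depending on how the UNet was exported, a thin wrapper matching the original UNet's call signature may also be needed.

```py
import torch

# Some diffusers code paths read unet.config, which the compiled module does not
# carry over, so copy it across first (may not be needed in all versions).
compiled_unet.config = pipe.unet.config
pipe.unet = compiled_unet

with torch.inference_mode():
    image = pipe("a photo of an astronaut riding a horse", num_inference_steps=30).images[0]
image.save("sd_trt_test.png")
```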

@lanluo-nvidia Any updates on this? Should I expect this issue to be resolved soon or will this be on the backlog for a while? Unfortunately, I only have H100s at...

@HolyWu Thanks for the suggestion. I hadn't looked at this for a while; I just tried my original code (`torch_tensorrt.save`, `torch.export.load`) on torch 2.5.1 and torch_tensorrt 2.5.0. Everything seems to be...
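
For reference, the save/load round trip I mean is roughly the following; `unet` is the original diffusers UNet, `arg_inputs_unet` the example inputs from my script, and the path and precision settings are illustrative:

```py
import torch
import torch_tensorrt

# Compile with the dynamo frontend (settings here are illustrative, not my exact ones).
compiled_unet = torch_tensorrt.compile(
    unet,
    ir="dynamo",
    inputs=arg_inputs_unet,
    enabled_precisions={torch.float16},
)

# Serialize as an exported program, then reload it via torch.export.
torch_tensorrt.save(compiled_unet, "unet_trt.ep", inputs=arg_inputs_unet)
loaded_unet = torch.export.load("unet_trt.ep").module()
```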