DX12 application crashes when discovery is started

Open Atrix256 opened this issue 1 year ago • 5 comments

I am very excited about GPU Reshape and look forward to using it to find problems in Gigi, an open-source rapid prototyping / development / research application that we use at Electronic Arts: https://github.com/electronicarts/gigi/

When I turn on discovery and then run GigiViewerDX12 (in debug), this call leaves g_pd3dInfoQueue null: g_pd3dDevice->QueryInterface(IID_PPV_ARGS(&g_pd3dInfoQueue)); The device is an ID3D12Device2*, and the info queue is an ID3D12InfoQueue*.

The crash itself is due to a missing null check, but adding one then results in a crash in a later CreateRenderTargetView call (in both debug and release):

Exception thrown at 0x00007FFBF81EFA4C in GigiViewerDX12.exe: Microsoft C++ exception: std::out_of_range at memory location 0x00000091450FEC00.
 	ucrtbase.dll!00007ffbf88df6fe()	Unknown
 	ucrtbase.dll!00007ffbf88dee19()	Unknown
 	vcruntime140_1.dll!00007ffbda191ab1()	Unknown
 	vcruntime140_1.dll!00007ffbda19232f()	Unknown
 	vcruntime140_1.dll!00007ffbda192389()	Unknown
 	vcruntime140_1.dll!00007ffbda194189()	Unknown
 	ntdll.dll!00007ffbfaf3527f()	Unknown
 	ntdll.dll!00007ffbfaeae886()	Unknown
 	ntdll.dll!00007ffbfaf3426e()	Unknown
 	KernelBase.dll!00007ffbf81efa4c()	Unknown
 	vcruntime140.dll!00007ffbd9f66ba7()	Unknown
 	msvcp140.dll!00007ffbbbd09542()	Unknown
 	GRS.Backends.DX12.Layer.dll!00007ffab35e3520()	Unknown
 	GRS.Backends.DX12.Layer.dll!00007ffab35e0b01()	Unknown
 	GRS.Backends.DX12.Layer.dll!00007ffab35e2260()	Unknown
 	WinPixGpuCapturer.dll!00007ffab447c8a0()	Unknown
>	GigiViewerDX12.exe!CreateRenderTarget() Line 7852	C++
 	GigiViewerDX12.exe!WndProc(HWND__ * hWnd, unsigned int msg, unsigned __int64 wParam, __int64 lParam) Line 7926	C++
 	[External Code]	
 	GigiViewerDX12.exe!main(int argc, char * * argv) Line 7359	C++
 	[External Code]	
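
For reference, the null check on the info queue mentioned above can be sketched roughly as follows. This is a portable stand-in, not the actual Gigi code: `Device`, `InfoQueue`, `QueryInfoQueue`, and `ConfigureInfoQueue` are hypothetical names that mimic `ID3D12Device2`, `ID3D12InfoQueue`, and `QueryInterface` so the snippet builds without the Windows SDK. The only point is that debug-layer setup is skipped when the query fails, instead of dereferencing a null pointer.

```cpp
#include <cassert>

// Hypothetical stand-ins for HRESULT, ID3D12InfoQueue and ID3D12Device2.
using Result = long;
constexpr Result kOk = 0;
constexpr Result kFail = -1;  // any negative value counts as failure here

struct InfoQueue {
    bool breakOnError = false;
};

struct Device {
    bool exposesInfoQueue;  // false models a device that does not hand out the interface
    InfoQueue queue;

    // Mimics QueryInterface: on failure the out pointer is set to null.
    Result QueryInfoQueue(InfoQueue** out) {
        assert(out != nullptr);
        if (!exposesInfoQueue) {
            *out = nullptr;
            return kFail;
        }
        *out = &queue;
        return kOk;
    }
};

// Returns true if the info queue was found and configured.
bool ConfigureInfoQueue(Device& device) {
    InfoQueue* queue = nullptr;
    if (device.QueryInfoQueue(&queue) < 0 || queue == nullptr) {
        // No info queue available: skip debug setup rather than crash.
        return false;
    }
    queue->breakOnError = true;
    return true;
}
```

In the real application the same shape would apply to the `QueryInterface` call: check the returned `HRESULT` and the pointer before touching `g_pd3dInfoQueue`.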

I've tried updating the Agility SDK to the latest version, but that didn't change anything. I've also updated my video driver, ensured all Windows updates were applied, and rebooted. I've tried other things as well, but no luck.

My code is open source on GitHub at the repo I linked, so you can see this for yourself, but please also let me know if there's any more information I can give, or anything you'd like me to try.

Thanks!

Atrix256, Nov 25 '24 18:11

Hi Alan!

Big fan of Gigi, happy to fix the above!

I've noticed the info queue issue before; I think it relates to how I wrap interface queries (on both devices and the global one). The render target crash is new. I'll try to reproduce it tomorrow and get a fix out for you.

Just in case it matters, what's your GPU?

miguel-petersen, Nov 25 '24 22:11

Thanks a lot, I appreciate it! It's a GeForce RTX 4090 and I'm on driver 32.0.15.6614.

Atrix256, Nov 25 '24 22:11

Still looking into the info queue.

The render target crash is related to how I interface with WinPixGpuCapturer. In some call stacks the order is Gigi -> Reshape -> WinPixGpuCapturer, and in others it is Gigi -> WinPixGpuCapturer -> Reshape. If I disable WinPixGpuCapturer, it goes through just fine.

Below I instrumented the DOF sample. I've noticed that it really struggles to resolve the line of code it belongs to (I hacked in debug info to CompileShaderToByteCode_Private).

[image: screenshot of the instrumented DOF sample]

I have issues in other places with hook/DLL ordering, and a couple of workarounds. Right now I rely on detouring by patching the functions themselves, but the real approach, which would avoid this, is to modify the Import Address Table instead. But that's a scarier change for the future.
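
To illustrate why the ordering matters, here is a toy model of two layers hooking the same entry point. It is only a sketch under heavy assumptions: every name in it (`g_entry`, `ReshapeHook`, `PixHook`, `CallChain`, and so on) is hypothetical, a plain function pointer stands in for the patched function, and nothing here reflects how Reshape or WinPixGpuCapturer are actually implemented. It just shows that whichever layer installs last ends up first in the call chain.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A single patchable entry point, standing in for a hooked D3D12 function.
using CreateFn = void (*)(std::vector<std::string>& trace);

void RealCreate(std::vector<std::string>& trace) { trace.push_back("D3D12"); }

CreateFn g_entry = RealCreate;  // what the application calls through

// Each layer remembers whatever was installed before it and forwards to it.
CreateFn g_reshapeNext = nullptr;
void ReshapeHook(std::vector<std::string>& trace) {
    assert(g_reshapeNext != nullptr);
    trace.push_back("Reshape");
    g_reshapeNext(trace);
}

CreateFn g_pixNext = nullptr;
void PixHook(std::vector<std::string>& trace) {
    assert(g_pixNext != nullptr);
    trace.push_back("WinPixGpuCapturer");
    g_pixNext(trace);
}

void InstallReshape() { g_reshapeNext = g_entry; g_entry = ReshapeHook; }
void InstallPix()     { g_pixNext = g_entry;     g_entry = PixHook; }

// Installs both hooks in the requested order and returns the resulting call chain.
std::string CallChain(bool reshapeInstallsFirst) {
    g_entry = RealCreate;  // start from an unhooked entry point
    if (reshapeInstallsFirst) {
        InstallReshape();
        InstallPix();
    } else {
        InstallPix();
        InstallReshape();
    }

    std::vector<std::string> trace{"Gigi"};
    g_entry(trace);

    std::string out;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        if (i != 0) out += " -> ";
        out += trace[i];
    }
    return out;
}
```

With these definitions, `CallChain(true)` yields `Gigi -> WinPixGpuCapturer -> Reshape -> D3D12` and `CallChain(false)` yields `Gigi -> Reshape -> WinPixGpuCapturer -> D3D12`. The sketch models only the ordering; it does not show inline patching or Import Address Table changes.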

miguel-petersen, Nov 26 '24 21:11

So, there are roughly three issues here:

  1. Debug info queues (not sure what's up yet)
  2. Hooking call order
  3. Source resolving

To clarify on the last one, I can make it resolve the source code itself, but not the line of code it ends up at, for some reason.

miguel-petersen, Nov 26 '24 21:11

Oh excellent! With PIX GPU capturing disabled, I'm able to launch, run the unit tests, and see all sorts of nice errors and warnings. I believe I'm functional now, if you want to close this out and leave those other items on the future TODO list. Thank you so much!

Atrix256, Nov 26 '24 23:11