Intrinsics ddx_fine/ddy_fine should not be allowed to sink into flow control

Open hekota opened this issue 2 years ago • 2 comments

Description

Intrinsics ddx_fine/ddy_fine should not be allowed to sink into flow control. This can currently happen because they are marked ReadNone, the same as other unary ops, even though they read values from neighboring threads in the same quad.

Steps to Reproduce

  1. Open tools/clang/unittests/HLSLExec/ShaderOpArith.xml, find data for HelperLaneTestNoWave and remove -opt-disable sink from the shader compile arguments.
  2. Run hcttest exec-filter *HelperLaneTest -verbose -adapter NVidia*

Actual Behavior

StartGroup: ExecutionTest::HelperLaneTest
Verifying IsHelperLane in shader model 6.0
Using Adapter:NVIDIA GeForce GTX 1660 Ti
Error: Verify: AreEqual(pTestResults[0].is_helper_00, 0) - Values (1, 0) [File: D:\dxc2\tools\clang\unittests\HLSLExec\ExecutionTest.cpp, Function: ExecutionTest::HelperLaneTest, Line: 10939]
EndGroup: ExecutionTest::HelperLaneTest [Failed]

Expected Behavior

StartGroup: ExecutionTest::HelperLaneTest
Verifying IsHelperLane in shader model 6.0
Using Adapter:NVIDIA GeForce GTX 1660 Ti
Verifying IsHelperLane in shader model 6.6
Using Adapter:NVIDIA GeForce GTX 1660 Ti
EndGroup: ExecutionTest::HelperLaneTest [Passed]

More info

This shader code

            int is_helper_accross_X = ReadAcrossX_DD(is_helper, isLeft);
            int is_helper_accross_Y = ReadAcrossY_DD(is_helper, isTop);
            int is_helper_accross_Diag = ReadAcrossDiagonal_DD(is_helper, isLeft, isTop);

            if (!isLeft && !isTop) { //bottom right pixel writes results
              g_testResults[i].is_helper_00 = is_helper_accross_Diag;
              g_testResults[i].is_helper_10 = is_helper_accross_Y;
              g_testResults[i].is_helper_01 = is_helper_accross_X;
              g_testResults[i].is_helper_11 = is_helper;
            }

is optimized into code equivalent to

            int is_helper_accross_X = ReadAcrossX_DD(is_helper, isLeft);

            if (!isLeft && !isTop) { //bottom right pixel writes results
              int is_helper_accross_Y = ReadAcrossY_DD(is_helper, isTop);
              int is_helper_accross_Diag = ReadAcrossDiagonal_DD(is_helper, isLeft, isTop);
              g_testResults[i].is_helper_00 = is_helper_accross_Diag;
              g_testResults[i].is_helper_10 = is_helper_accross_Y;
              g_testResults[i].is_helper_01 = is_helper_accross_X;
              g_testResults[i].is_helper_11 = is_helper;
            }

The ReadAcrossDiagonal_DD function reads a value from a neighboring thread in the same quad. That value has not been calculated, because this call and the ReadAcrossY_DD call were moved under the if flow control and therefore did not execute on all threads in the quad.
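To make the failure mode concrete, here is a minimal CPU-side sketch in Python (not shader code) of a 2x2 quad. It rests on several assumptions: ddx_fine is modeled as the right-minus-left difference within each row of the quad, read_across_x is a hypothetical reconstruction of what a derivative-based ReadAcrossX_DD helper might do (the real helper's body is not shown in this issue), and None stands in for the undefined value a lane gets when a needed neighbor did not execute the instruction. Real hardware returns some arbitrary value rather than None.

```python
# Lane layout in a 2x2 quad: index = y * 2 + x
#   0: top-left     1: top-right
#   2: bottom-left  3: bottom-right

def ddx_fine(values, active):
    """Model of a fine horizontal derivative: right - left within each row.

    The result is None (modeling 'undefined') for a lane that is inactive
    or whose row neighbor did not execute the instruction.
    """
    out = [None] * 4
    for lane in range(4):
        row = lane // 2
        left, right = row * 2, row * 2 + 1
        if active[lane] and active[left] and active[right]:
            out[lane] = values[right] - values[left]
    return out


def read_across_x(values, active):
    """Hypothetical derivative-based read of the horizontal neighbor's value.

    Left lane:  v + (right - left) == right
    Right lane: v - (right - left) == left
    """
    deriv = ddx_fine(values, active)
    out = []
    for lane in range(4):
        if deriv[lane] is None:
            out.append(None)
            continue
        is_left = lane % 2 == 0
        out.append(values[lane] + deriv[lane] * (1 if is_left else -1))
    return out


values = [10, 20, 30, 40]  # arbitrary per-lane inputs for illustration

# Before sinking: all four lanes execute the read.
all_lanes = [True, True, True, True]
print(read_across_x(values, all_lanes))

# After sinking under `if (!isLeft && !isTop)`: only the bottom-right lane runs.
only_bottom_right = [False, False, False, True]
print(read_across_x(values, only_bottom_right))
```

With all lanes active, each lane recovers its horizontal neighbor's value. Once only the bottom-right lane executes the read, its left neighbor never contributes, so the result is undefined in this model. A vertical or diagonal read would fail the same way, which matches the sunk ReadAcrossY_DD and ReadAcrossDiagonal_DD calls described above.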

Environment

  • DXC version 634c2f349 or later (main branch from July 28 forward, after HLSL 2021 was enabled by default).
  • Windows

hekota, Sep 18 '23 23:09

I am hitting this problem, is there any known workaround?

simondeschenes, Jan 07 '24 23:01

The compile flag -opt-disable sink should help with that.

hekota, Jan 08 '24 18:01