Possible incorrect sorting on Wave Size 128
See https://github.com/aras-p/UnityGaussianSplatting/issues/112
Possibly on 64 as well, since Qualcomm Adreno GPUs have a default subgroup size of 64.
As stated on the README, I have tested on wave size 64.
Was it on an AMD GPU? Or Qualcomm Adreno?
Tested on a 7900 XT with wave size locked to 64 using [WaveSize(<numLanes>)]. That being said, I have an Adreno 618 on the way, so I will be able to debug it then.
Welp, that's extremely disappointing.
I will have to get another device. The problem is that I can't readily find information on which Qualcomm devices support WaveIntrinsics. The best I can find is this. I also don't have the resources at the moment to go out and buy a laptop just to test my code on, so for now this will have to be on hold.
I usually use gpuinfo to check subgroup support in Vulkan for Qualcomm devices. Subgroup ops correspond one-to-one to wave ops in most cases. I also have a Quest 3 for which subgroup size is 64 (not sure if it can be changed to 128). If you need help in testing, I'd be happy to help debug.
> I usually use gpuinfo to check subgroup support in Vulkan for Qualcomm devices.
I totally forgot that Vulkan has its own database. In fact, if I remember correctly, the D3D12 one is a fork of, or at least based on, the Vulkan one.
> I also have a Quest 3 for which subgroup size is 64 (not sure if it can be changed to 128). If you need help in testing, I'd be happy to help debug.
Do you know if it is possible to run this implementation on a Windows PC, using the Quest 3's Qualcomm chip as the device? The problem is that the D3D12 implementation probably will not run natively on the Quest 3. I've been meaning to make a Vulkan implementation, which shouldn't be too bad (DXC can compile to SPIR-V, so I would precompile the shaders from HLSL), but I just haven't had the time.
However, a user from another repo also has a Quest 3, and has offered to run tests for me, but in Unity. I believe that's the most straightforward course of action, because Unity handles the transpilation from HLSL to SPIR-V and can use Vulkan as the backend to run on Quest.
I very much appreciate the help though.
> Do you know if it is possible to run this implementation on a Windows PC, using the Quest 3's Qualcomm chip as the device?
I doubt it, or at least I haven't seen it done before. It should be possible through Unity though. Looks like you're already on top of it!
You can try changing waveFlags &= t ? ballot : ~ballot; to waveFlags &= (t ? ballot : (~ballot)); in the function WarpLevelMultiSplitWGE16. With this change, a small Gaussian Splatting model can be displayed. @b0nes164
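For anyone following along, here is a minimal CPU-side sketch of what that line does, assuming a wave of at most 64 lanes and a 4-bit digit. It is not the code from the repo: `Ballot`, `MultiSplit`, and `kRadixLog` are hypothetical names made up for the illustration, and `Ballot` only imitates what a wave-wide ballot returns. Each lane starts with a mask of all lanes and, for every bit of its digit, keeps only the lanes whose bit matches its own, so it ends up with the set of lanes holding the same digit. The flag switches between the two spellings of the statement.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for this sketch: a 4-bit digit, to keep the example small.
constexpr int kRadixLog = 4;

// Imitates a wave-wide ballot on the CPU: bit i is set when lane i's predicate is true.
uint64_t Ballot(const std::vector<bool>& pred) {
    uint64_t mask = 0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        if (pred[i]) mask |= (uint64_t{1} << i);
    }
    return mask;
}

// For each lane, returns the mask of lanes whose low kRadixLog key bits match its own.
// keys.size() is the emulated wave size and must be at most 64.
std::vector<uint64_t> MultiSplit(const std::vector<uint32_t>& keys, bool parenthesized) {
    const std::size_t n = keys.size();
    const uint64_t allLanes = (n >= 64) ? ~uint64_t{0} : ((uint64_t{1} << n) - 1);
    std::vector<uint64_t> waveFlags(n, allLanes);

    for (int bit = 0; bit < kRadixLog; ++bit) {
        std::vector<bool> pred(n);
        for (std::size_t i = 0; i < n; ++i) pred[i] = ((keys[i] >> bit) & 1u) != 0;
        const uint64_t ballot = Ballot(pred);

        for (std::size_t i = 0; i < n; ++i) {
            const bool t = pred[i];
            if (parenthesized) {
                waveFlags[i] &= (t ? ballot : (~ballot));  // suggested spelling
            } else {
                waveFlags[i] &= t ? ballot : ~ballot;      // original spelling
            }
        }
    }
    return waveFlags;
}
```

In standard C++ both spellings parse the same way, so the two modes return identical masks here. This sketch says nothing about how a particular HLSL compiler or transpiler handles the unparenthesized form, which is the part that would still need checking on the affected device.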
Closing this issue, as I have confirmed the bug does not have to do with the wave size. @LeeSYSU, copying your comment to #4.