Add GpuSharedMemory Tests for Intel GPU
Problem We need specific tests for Shared Memory on Intel GPU. Currently we have only shared memory test for Nvidia GPU with CUDA.
Solution This PR adds the ./Tests/GpuSharedMemoryTestIntel executable. I took the GpuSharedMemoryTest.cpp code and adapted it from Nvidia CUDA to L0
Impact It will allows us to test our SM implementation, work in progress https://github.com/intel-ai/hdk/pull/534
This mostly the same as the original smem test, right? The only real difference is the kernel submission. Can we get rid of code duplication for all the codegen and helpers methods?
- Fixed style-check for CI
- Need to fix ./Tests/GpuSharedMemoryTestIntel
Current issue with GpuSharedMemoryTestIntel executable:
$ ./Tests/GpuSharedMemoryTestIntel
[==========] Running 7 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 7 tests from SingleColumn
[ RUN ] SingleColumn.VariableEntries_CountQuery_4B_Group
Adding global slm.buf.i64
PerformReductionTest - after linking
PerformReductionTest - before calling conv
PerformReductionTest - after calling conv
PerformReductionTest - before linking
compile_and_link_gpu_code - before writeSpirv
compile_and_link_gpu_code - before spv_to_bin
unknown file: Failure
C++ exception with description "L0 error: error occurred when building module, see build log for details" thrown in the test body.
[ FAILED ] SingleColumn.VariableEntries_CountQuery_4B_Group (70 ms)
hdk_log:
2023-07-05T10:12:06.622687 I 3915859 0 0 L0Mgr.cpp:76 Discovered 1 driver(s) for L0 platform.
2023-07-05T10:12:06.661611 I 3915859 0 0 L0Mgr.cpp:229 L0 module build log: error: IGC SPIRV Consumer does not support the following opcode: 65534
^@
IR: https://gist.github.com/lmontigny/3663842110065557e18df09ea5f658d5
@kurapov-peter Can you have a look the IR & GpuSharedMemoryTestIntel.cpp around init_smem_func L134? I took the GpuSharedMemoryTest.cpp (cuda) and modified to support L0 - I'm not sure it's 100% correct.
I see that none of the functions except for the wrapper_kernel have a proper calling convention set. If that's the final representation of the kernel you convert to spirv I'd assume the error you are seeing is related, e.g. it tries to execute generate a call instruction with the default calling conv and doesn't find such in the isa.
Another thing to check is that your group-by buffer size does not produce indexes that are out of the allocated SLM.
The last thing that caught my eye is that there's a declaration of the init_shared_mem that doesn't match the definition, so you might be calling a non-existent function.
Oh, and I think init_smem_func uses global address space instead of the shared one. This is the most probable reason you get the error.
Previous Spirv IGC opcode issue fixed internally.
Now looking at C++ exception with description "L0 error: kernel name is not found in the module" thrown in the test body
The last thing that caught my eye is that there's a declaration of the
init_shared_memthat doesn't match the definition, so you might be calling a non-existent function.
In the latest IR with the IGC fix, I don't have any declaration for init_shared_mem, is it normal? attached IR
Fixed large portion of the codegen at different level, executable now running.
Next step is to fix the computation to have correct results. (currently cpu != gpu results)
hdk/omniscidb/Tests/GpuSharedMemoryTestIntel.cpp:416: Failure Expected equality of these values: cmp_result Which is: 1 0 [ FAILED ] SingleColumn.VariableEntries_CountQuery_4B_Group (247 ms) [----------] 1 test from SingleColumn (247 ms total)