Testing fails with segmentation fault
OS: Ubuntu 24.04, freshly provisioned
Hardware: i7-1185G7, i9-12900 (reproduced on both)
Level Zero version: 1.19.2
Setup: I am working to get the level-zero version in Ubuntu bumped to 1.19.2, which is why I'm installing from a PPA:
sudo apt install -y build-essential cmake
sudo add-apt-repository ppa:mckeesh/testing
sudo apt -y update
sudo apt install libze1=1.19.2-0ubuntu1
git clone https://github.com/oneapi-src/level-zero
cd level-zero/
git checkout v1.19.2
mkdir build
cd build/
cmake -DBUILD_L0_LOADER_TESTS=yes ..
make -j`nproc`
./bin/tests
Result:
Running main() from /home/ubuntu/level-zero/build/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 15 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from LoaderAPI
[ RUN ] LoaderAPI.GivenLevelZeroLoaderPresentWhenCallingzeGetLoaderVersionsAPIThenValidVersionIsReturned
/home/ubuntu/level-zero/test/loader_api.cpp:26: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(0)
Which is: 2013265921
Found 1 versions
component.component_name: loader
component.component_lib_version.major: 1
component.spec_version: 65548
component.component_lib_name: loader
[ FAILED ] LoaderAPI.GivenLevelZeroLoaderPresentWhenCallingzeGetLoaderVersionsAPIThenValidVersionIsReturned (0 ms)
[----------] 1 test from LoaderAPI (0 ms total)
[----------] 11 tests from LoaderInit
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithTypesUnsupportedWithFailureThenSupportedTypesThenSuccessReturned
Segmentation fault (core dumped)
For Reference: https://dgpu-docs.intel.com/driver/client/overview.html
View the section on Ubuntu 24.04
Two different things happening here:
Error code 2013265921 (decimal) is ZE_RESULT_ERROR_UNINITIALIZED = 0x78000001 ( https://github.com/oneapi-src/level-zero/blob/1c0320bfdf0afe4b361e0297f9d10ac9dd6756fd/include/ze_api.h#L217 ). This indicates that the L0 Loader cannot find a working Intel(R) GPU or NPU driver on the system, that those drivers failed to find a valid device to attach to, or that the system did not load a kernel driver. You posted the CPUs, both of which have valid HW, so the hardware isn't the issue. The question is whether the KMD drivers are actually running and whether the user mode driver is installed.
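As a quick sanity check on that decoding, the decimal value gtest prints can be converted to hexadecimal and compared against the enum in ze_api.h. This is only a small sketch; the helper name `ze_result_hex` is made up for illustration and is not part of Level Zero.

```shell
# Print a Level Zero result code (given in decimal) as 32-bit hexadecimal.
# ze_result_hex is a hypothetical helper name, not a Level Zero tool.
ze_result_hex() {
    printf '0x%08X\n' "$1"
}

# Value reported by the failing zeInit(0) assertion in the log above.
ze_result_hex 2013265921   # prints 0x78000001
```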
First, check that you have a User Mode Driver installed, then re-test: $ sudo apt install libze-intel-gpu1 (I am assuming your PPA has this package; if not, install the one Intel is hosting.)
If that doesn't work, check your dmesg to ensure that an Intel GPU kernel driver is being loaded during boot via i915.ko (potentially also xekmd.ko if you've got anything newer in a discrete slot).
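A rough way to run the kernel-driver check from a shell is sketched below. It rests on a few assumptions: `has_module` is a hypothetical helper, `xe` is assumed to be the module name under which the newer kernel driver (called xekmd.ko above) shows up in `lsmod`, and the render nodes are assumed to live under /dev/dri as on a typical Ubuntu install.

```shell
# Succeed if the named kernel module appears in lsmod-style output on stdin.
# has_module is a hypothetical helper used only for this sketch.
has_module() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# i915 is the driver named above; "xe" is an assumed name for the newer driver.
for mod in i915 xe; do
    if lsmod 2>/dev/null | has_module "$mod"; then
        echo "$mod: loaded"
    else
        echo "$mod: not loaded"
    fi
done

# Render nodes are what the user mode driver opens; the listing also shows
# which user and group are allowed to access them.
ls -l /dev/dri/renderD* 2>/dev/null || echo "no render nodes under /dev/dri"
```

If neither module is reported as loaded, `sudo dmesg | grep -i -E 'i915|xe'` may show why the driver did not bind to the device.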
The segfault clearly shouldn't happen, but it is likely an error induced by the missing drivers. We should address the segfault within the L0 Loader.
Hmm, interesting. I tried moving to Ubuntu 24.10 since we have been working more on that and have had validation teams check our stack. Here's what I'm seeing now:
ubuntu@hp-elite-mini-800-g9-desktop-pc-c29603:~/level-zero/build$ apt policy libze1
libze1:
Installed: 1.19.2.0-1076~24.10
Candidate: 1.19.2.0-1076~24.10
Version table:
*** 1.19.2.0-1076~24.10 500
500 https://ppa.launchpadcontent.net/kobuk-team/intel-graphics/ubuntu oracular/main amd64 Packages
100 /var/lib/dpkg/status
1.17.42-1 500
500 http://archive.ubuntu.com/ubuntu oracular/universe amd64 Packages
ubuntu@hp-elite-mini-800-g9-desktop-pc-c29603:~/level-zero/build$ apt policy libze-intel-gpu1
libze-intel-gpu1:
Installed: 24.52.32224.5-1~24.10~ppa2
Candidate: 24.52.32224.5-1~24.10~ppa2
Version table:
*** 24.52.32224.5-1~24.10~ppa2 500
500 https://ppa.launchpadcontent.net/kobuk-team/intel-graphics/ubuntu oracular/main amd64 Packages
100 /var/lib/dpkg/status
24.35.30872.24-1 500
500 http://archive.ubuntu.com/ubuntu oracular/universe amd64 Packages
ubuntu@hp-elite-mini-800-g9-desktop-pc-c29603:~/level-zero/build$ ./bin/zello_world
Driver not initialized: ZE_RESULT_ERROR_UNINITIALIZED
Did NOT find matching ZE_DEVICE_TYPE_GPU device!
Running the same tests as before, I'm seeing more tests run after installing an updated libze-intel-gpu1 version, but the segfault and uninitialized issues are still there:
Running main() from /home/ubuntu/level-zero/build/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 15 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 1 test from LoaderAPI
[ RUN ] LoaderAPI.GivenLevelZeroLoaderPresentWhenCallingzeGetLoaderVersionsAPIThenValidVersionIsReturned
/home/ubuntu/level-zero/test/loader_api.cpp:26: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(0)
Which is: 2013265921
Found 2 versions
component.component_name: loader
component.component_lib_version.major: 1
component.spec_version: 65547
component.component_lib_name: loader
component.component_name: tracing layer
component.component_lib_version.major: 1
component.spec_version: 65547
component.component_lib_name: tracing layer
[ FAILED ] LoaderAPI.GivenLevelZeroLoaderPresentWhenCallingzeGetLoaderVersionsAPIThenValidVersionIsReturned (24 ms)
[----------] 1 test from LoaderAPI (24 ms total)
[----------] 11 tests from LoaderInit
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithTypesUnsupportedWithFailureThenSupportedTypesThenSuccessReturned
/home/ubuntu/level-zero/test/loader_api.cpp:64: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:65: Failure
Expected: (pCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithTypesUnsupportedWithFailureThenSupportedTypesThenSuccessReturned (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithGPUTypeThenExpectPassWithGPUorAllOnly
/home/ubuntu/level-zero/test/loader_api.cpp:77: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:78: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:81: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:82: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:85: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:86: Failure
Expected: (pCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithGPUTypeThenExpectPassWithGPUorAllOnly (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithNPUTypeThenExpectPassWithNPUorAllOnly
/home/ubuntu/level-zero/test/loader_api.cpp:98: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:99: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:102: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:103: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:106: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:107: Failure
Expected: (pCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithNPUTypeThenExpectPassWithNPUorAllOnly (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithAnyTypeWithNullDriverAcceptingAllThenExpectatLeast1Driver
/home/ubuntu/level-zero/test/loader_api.cpp:119: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:120: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:123: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:124: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:127: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:128: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:131: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:132: Failure
Expected: (pCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversWithAnyTypeWithNullDriverAcceptingAllThenExpectatLeast1Driver (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversThenzeInitThenBothCallsSucceedWithAllTypes
/home/ubuntu/level-zero/test/loader_api.cpp:145: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pInitDriversCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:146: Failure
Expected: (pInitDriversCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:147: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(0)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:149: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversThenzeInitThenBothCallsSucceedWithAllTypes (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversThenzeInitThenBothCallsSucceedWithGPUTypes
/home/ubuntu/level-zero/test/loader_api.cpp:162: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pInitDriversCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:163: Failure
Expected: (pInitDriversCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:164: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(ZE_INIT_FLAG_GPU_ONLY)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:166: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversThenzeInitThenBothCallsSucceedWithGPUTypes (0 ms)
[ RUN ] LoaderInit.GivenZeInitDriversUnsupportedOnTheDriverWhenCallingZeInitDriversThenUninitializedReturned
/home/ubuntu/level-zero/test/loader_api.cpp:181: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(0)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:183: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenZeInitDriversUnsupportedOnTheDriverWhenCallingZeInitDriversThenUninitializedReturned (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversThenzeInitThenBothCallsSucceedWithNPUTypes
/home/ubuntu/level-zero/test/loader_api.cpp:196: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pInitDriversCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:197: Failure
Expected: (pInitDriversCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:198: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(ZE_INIT_FLAG_VPU_ONLY)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:200: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingZeInitDriversThenzeInitThenBothCallsSucceedWithNPUTypes (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingzeInitThenZeInitDriversThenBothCallsSucceedWithAllTypes
/home/ubuntu/level-zero/test/loader_api.cpp:213: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(0)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:214: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pInitDriversCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:215: Failure
Expected: (pInitDriversCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:217: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingzeInitThenZeInitDriversThenBothCallsSucceedWithAllTypes (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingzeInitThenZeInitDriversThenBothCallsSucceedWithGPUTypes
/home/ubuntu/level-zero/test/loader_api.cpp:230: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(ZE_INIT_FLAG_GPU_ONLY)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:231: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pInitDriversCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:232: Failure
Expected: (pInitDriversCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:234: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingzeInitThenZeInitDriversThenBothCallsSucceedWithGPUTypes (0 ms)
[ RUN ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingzeInitThenZeInitDriversThenBothCallsSucceedWithNPUTypes
/home/ubuntu/level-zero/test/loader_api.cpp:247: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(ZE_INIT_FLAG_VPU_ONLY)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:248: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pInitDriversCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_api.cpp:249: Failure
Expected: (pInitDriversCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_api.cpp:251: Failure
Expected: (pDriverGetCount) > (0), actual: 0 vs 0
[ FAILED ] LoaderInit.GivenLevelZeroLoaderPresentWhenCallingzeInitThenZeInitDriversThenBothCallsSucceedWithNPUTypes (0 ms)
[----------] 11 tests from LoaderInit (0 ms total)
[----------] 3 tests from LoaderValidation
[ RUN ] LoaderValidation.GivenLevelZeroLoaderPresentWhenCallingzeCommandListAppendMemoryCopyWithCircularDependencyOnEventsThenValidationLayerPrintsWarningOfDeadlock
/home/ubuntu/level-zero/test/loader_validation_layer.cpp:28: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInit(ZE_INIT_FLAG_GPU_ONLY)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_validation_layer.cpp:29: Failure
Expected equality of these values:
ZE_RESULT_SUCCESS
Which is: 0
zeInitDrivers(&pCount, nullptr, &desc)
Which is: 2013265921
/home/ubuntu/level-zero/test/loader_validation_layer.cpp:30: Failure
Expected: (pCount) > (0), actual: 0 vs 0
/home/ubuntu/level-zero/test/loader_validation_layer.cpp:72: Failure
Expected: (pDevice) != (nullptr), actual: NULL vs (nullptr)
Segmentation fault (core dumped)
Alright, I did 2 things to improve my situation:
- Installed the NPU UMD, which provided /usr/lib/x86_64-linux-gnu/libze_intel_vpu.so.1
- Ran the tests as root. This only became obviously necessary once strace showed that it was failing to access /usr/lib/x86_64-linux-gnu/libze_intel_vpu.so.1
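For anyone who wants to repeat the strace step, something along these lines should surface the failed library accesses. It is a sketch rather than the exact command used here: `failed_ze_opens` is a hypothetical filter, and the choice of `openat` and `access` as the syscalls to trace is an assumption about how the loader looks for driver libraries.

```shell
# Keep only failed syscalls (return value -1) that mention a libze_* library.
# failed_ze_opens is a hypothetical helper used only for this sketch.
failed_ze_opens() {
    grep -E 'libze_[A-Za-z0-9_]*\.so' | grep -F ' = -1 '
}

# Only run when strace and the test binary from the build steps are present.
if command -v strace >/dev/null 2>&1 && [ -x ./bin/tests ]; then
    # -f follows child processes; -e trace=... limits output to these syscalls.
    strace -f -e trace=openat,access ./bin/tests 2>&1 | failed_ze_opens || true
fi
```

An EACCES result on a driver library points at file or directory permissions, whereas ENOENT would suggest the package providing it is not installed.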
After that, the tests seemed to be getting stuck, so I did some debugging and found that it was hanging on a call to zeCommandQueueSynchronize. I made the following change to get past that by reducing the timeout:
diff --git a/test/loader_validation_layer.cpp b/test/loader_validation_layer.cpp
index d04f795..2138d90 100644
--- a/test/loader_validation_layer.cpp
+++ b/test/loader_validation_layer.cpp
@@ -168,26 +180,36 @@ TEST(
status = zeCommandQueueCreate(context, pDevice, &command_queue_description, &command_queue);
EXPECT_EQ(ZE_RESULT_SUCCESS, status);
+ std::cout << "lvl 6" << std::endl;
+
status = zeCommandQueueExecuteCommandLists(command_queue, 1, &command_list, nullptr);
EXPECT_EQ(ZE_RESULT_SUCCESS, status);
+ std::cout << "lvl 6.1" << std::endl;
- status = zeCommandQueueSynchronize(command_queue, UINT64_MAX);
+ status = zeCommandQueueSynchronize(command_queue, 10000000000);
EXPECT_EQ(ZE_RESULT_SUCCESS, status);
From there, I was able to get through all the tests on i9-12900.
However, on Core Ultra 7 268V, I'm still getting a test hang here in ze_libapi.cpp during the test LoaderValidation.GivenLevelZeroLoaderPresentWhenCallingzeCommandListAppendMemoryCopyWithCircularDependencyOnEventsThenValidationLayerPrintsWarningOfDeadlock:
ze_result_t ZE_APICALL
zeMemFree(
ze_context_handle_t hContext, ///< [in] handle of the context object
void* ptr ///< [in][release] pointer to memory to free
)
{
if(ze_lib::context->inTeardown) {
return ZE_RESULT_ERROR_UNINITIALIZED;
}
auto pfnFree = ze_lib::context->zeDdiTable.load()->Mem.pfnFree;
if( nullptr == pfnFree ) {
if(!ze_lib::context->isInitialized)
return ZE_RESULT_ERROR_UNINITIALIZED;
else
return ZE_RESULT_ERROR_UNSUPPORTED_FEATURE;
}
return pfnFree( hContext, ptr ); /////////////// GETS STUCK ///////////////////
}
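To reproduce the hang without tying up a terminal, the affected suite can be run on its own under a wall-clock limit. The sketch below assumes the build directory layout from the steps at the top of this issue; `run_with_limit` is a hypothetical wrapper, the 120-second limit is arbitrary, `--gtest_filter` is the standard GoogleTest selection flag, and exit status 124 is what coreutils `timeout` returns when the limit expires.

```shell
# Run a command with a wall-clock limit and report when the limit is hit.
# run_with_limit is a hypothetical wrapper around coreutils `timeout`.
run_with_limit() {
    limit="$1"; shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Only attempt the run when the test binary from the build steps exists.
if [ -x ./bin/tests ]; then
    run_with_limit 120 ./bin/tests --gtest_filter='LoaderValidation.*'
fi
```

A 124 exit status here separates a genuine hang from a test that merely fails, which helps when comparing the i9-12900 and Core Ultra 7 268V runs.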