Games crash on launch w/ NVIDIA and AMD drivers/hardware loaded/enabled at the same time.
I'm using VFIO for the occasional incompatible Windows game. All games seem to fail during startup w/ Proton 5.13+ (tried 7.x, Experimental, etc.) whenever the NVIDIA card is bound to the host. My main display runs off the AMD iGPU and I'm launching w/ prime-run steam, but the problem manifests with or without prime-run. If I unbind the NVIDIA card, Proton runs fine. Proton versions < 5.13 also run fine.
Issue seems similar to https://github.com/ValveSoftware/Proton/issues/6180
I'm using Arch Linux, Ryzen 5700G, nVidia 3070
Logs attached:
slr-app837470-t20221007T121808.log steam-837470.log sysinfo.log
Console log:
/bin/sh\0-c\0PROTON_LOG=1 /home/john/.local/share/Steam/ubuntu12_32/reaper SteamLaunch AppId=837470 -- /home/john/.local/share/Steam/ubuntu12_32/steam-launch-wrapper -- '/home/john/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier'/_v2-entry-point --verb=waitforexitandrun -- '/home/john/.local/share/Steam/steamapps/common/Proton 5.13'/proton waitforexitandrun '/home/john/.local/share/Steam/steamapps/common/Untitled Goose Game/Untitled.exe'\0
Game process added : AppID 837470 "PROTON_LOG=1 /home/john/.local/share/Steam/ubuntu12_32/reaper SteamLaunch AppId=837470 -- /home/john/.local/share/Steam/ubuntu12_32/steam-launch-wrapper -- '/home/john/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier'/_v2-entry-point --verb=waitforexitandrun -- '/home/john/.local/share/Steam/steamapps/common/Proton 5.13'/proton waitforexitandrun '/home/john/.local/share/Steam/steamapps/common/Untitled Goose Game/Untitled.exe'", ProcID 14418, IP 0.0.0.0:0
chdir /home/john/.local/share/Steam/steamapps/common/Untitled Goose Game
ERROR: ld.so: object '/home/john/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
ERROR: ld.so: object '/home/john/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS64): ignored.
GameAction [AppID 837470, ActionID 1] : LaunchApp changed task to WaitingGameWindow with ""
ERROR: ld.so: object '/home/john/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
ERROR: ld.so: object '/home/john/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
ERROR: ld.so: object '/home/john/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
ERROR: ld.so: object '/home/john/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
GameAction [AppID 837470, ActionID 1] : LaunchApp changed task to Completed with ""
ThreadGetProcessExitCode: no such process 14529
ThreadGetProcessExitCode: no such process 14527
ThreadGetProcessExitCode: no such process 14420
Game process updated : AppID 837470 "PROTON_LOG=1 /home/john/.local/share/Steam/ubuntu12_32/reaper SteamLaunch AppId=837470 -- /home/john/.local/share/Steam/ubuntu12_32/steam-launch-wrapper -- '/home/john/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier'/_v2-entry-point --verb=waitforexitandrun -- '/home/john/.local/share/Steam/steamapps/common/Proton 5.13'/proton waitforexitandrun '/home/john/.local/share/Steam/steamapps/common/Untitled Goose Game/Untitled.exe'", ProcID 14528, IP 0.0.0.0:0
Installing breakpad exception handler for appid(steam)/version(1665100899)
Installing breakpad exception handler for appid(steam)/version(1665100899)
Steam: An X Error occurred
X Error of failed request: BadMatch (invalid parameter attributes)
Major opcode of failed request: 148
Serial number of failed request: 338
xerror_handler: X failed, continuing
Hello @jhnphm, please copy your system information from Steam (Steam -> Help -> System Information) and put it in a gist, then include a link to the gist in this issue report.
I've copied it into the updated post above in sysinfo.log but also here: https://gist.github.com/jhnphm/f9e45d04d374cb9613386ac094b5e50a
Thanks, AMDVLK has a history of breaking other Vulkan driver implementations. If you remove / disable AMDVLK and use mesa/RADV instead, are you able to reproduce this scenario?
12:42:52.860029: pressure-vessel-wrap[27962]: I: Vulkan ICD #0 at /usr/share/vulkan/icd.d/amd_icd32.json: /usr/lib32/amdvlk32.so AMDVLK is still in the mix in your test.
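For anyone checking the same thing, here is a minimal sketch for spotting leftover AMDVLK manifests. It assumes the Vulkan ICD manifests are JSON files in /usr/share/vulkan/icd.d (as in the log line above) and possibly /etc/vulkan/icd.d; the function name list_amdvlk_icds is only for illustration.

```shell
# Print every ICD manifest in the given directories that mentions AMDVLK.
# Directories that do not exist or hold no *.json files are skipped silently.
list_amdvlk_icds() {
    for dir in "$@"; do
        for manifest in "$dir"/*.json; do
            [ -f "$manifest" ] || continue
            if grep -qi 'amdvlk' "$manifest"; then
                printf '%s\n' "$manifest"
            fi
        done
    done
}

# Example (paths are assumptions, adjust for your distribution):
# list_amdvlk_icds /usr/share/vulkan/icd.d /etc/vulkan/icd.d
```

If this prints anything, both the 64-bit and 32-bit AMDVLK packages are worth checking before re-running the test.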
Ah, left the 32-bit amdvlk in the mix. New test:
For reference, this is a working run w/ the NVIDIA GPU unbound, run w/o prime-run: slr-app837470-t20221007T134748.log steam-837470.log
For apples to apples, nonworking run, NVIDIA GPU bound, w/o prime-run: slr-app837470-t20221007T135131.log steam-837470.log
A working NVIDIA GPU bound, w/o prime-run, on Proton 5.0:
steam-837470.log (couldn't find the steam runtime logfiles for some reason)
A working NVIDIA GPU bound, w/ prime-run, on Proton 5.0:
slr-app1420170-t20221007T135842.log
Basically, the combination of Proton 5.13+ AND the NVIDIA GPU being bound to the host is what breaks. The GPU doesn't need to be active: it makes no difference whether prime-run is used or not.
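To label each log with the GPU state it was taken in, something like the following sketch can report which kernel driver a PCI device is currently bound to. It reads the driver symlink under sysfs; the PCI address 0000:01:00.0 and the function name pci_driver are placeholders, not values from my system.

```shell
# Print the kernel driver bound to a PCI device, or "unbound" if none.
# $1: PCI address (e.g. 0000:01:00.0, a placeholder)
# $2: sysfs root, defaults to /sys (overridable for testing)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "unbound"
    fi
}

# Example: pci_driver 0000:01:00.0
```

On a VFIO setup this would typically print vfio-pci while the card is reserved for the VM and the host driver name once it is bound to the host.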
Actually, I'm not even able to launch winecfg in the prefix w/ the NVIDIA GPU bound:
john@thor [02:27:47 PM] [~]
-> % export GAMEID=837470
john@thor [02:28:11 PM] [~]
-> % WINEPREFIX=~/.steam/steam/steamapps/compatdata/$GAMEID/pfx/ WINEARCH=win64 .steam/steam/steamapps/common/Proton\ 7.0/dist/bin/wine64 'winecfg.exe'
wineserver: using server-side synchronization.
wine: RLIMIT_NICE is <= 20, unable to use setpriority safely
wine: Unhandled page fault on execute access to 00007F2D614EF3D0 at address 00007F2D614EF3D0 (thread 00cc), starting debugger...
00c4:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
00c4:err:winediag:nodrv_CreateWindow The explorer process failed to start.
john@thor [02:28:15 PM] [~]
Installing vulkan-mesa-layers/lib32-vulkan-mesa-layers (https://bbs.archlinux.org/viewtopic.php?id=279672) lets winecfg and Untitled Goose Game run directly w/ Proton, but it still breaks if prime-run is enabled or if it's run through Steam, w/ the common error signature:
00c4:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
00c4:err:winediag:nodrv_CreateWindow The explorer process failed to start.
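A quick way to check whether a given log shows this signature, as a sketch: it counts the nodrv_CreateWindow lines in a file. The function name is made up, and the example path assumes the PROTON_LOG=1 output landed in the home directory as steam-837470.log.

```shell
# Count the lines in a log file that carry the nodrv_CreateWindow error.
# $1: path to a Proton/Wine log
count_nodrv_errors() {
    grep -c 'err:winediag:nodrv_CreateWindow' "$1"
}

# Example (path is an assumption):
# count_nodrv_errors ~/steam-837470.log
```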
Potentially related: https://www.reddit.com/r/linux_gaming/comments/rvzu5p/cant_run_winelutrisproton_apps_on_a_gpu_thats_not/ . It looks like I can get this to work, at least to start winecfg from the command line, if I bind the GPU before starting X, but that means I can no longer unbind it for passing it through to a VM w/o restarting X. Running untitled goose game from steam still doesn't work though.
Most other native applications like vkcube, as well as Proton <= 5.0, work fine on the NVIDIA dGPU even when Xorg was not started after the GPU was bound, so it does seem like a Proton/Wine regression.
I can get prime-run to work w/ the scripts generated by using PROTON_DUMP_DEBUG_COMMANDS, if I switch to wayland, but I still can't get it to run via the steam GUI. Looks like bypassing the steam runtime with the arch steam-native script works too.
This might be a Proton regression, but you said that Proton <= 5.0 is good and 5.13+ is bad, which suggests that one important factor might be whether you're using the SteamLinuxRuntime_soldier container runtime (which is used by Proton 5.13+, and optionally for native Linux games) or not (Proton <= 5.0 and most native Linux games).
However, there were also a lot of non-container-runtime-related changes between Proton 5.0 and 5.13, so it's also possible that this is genuinely a Proton problem and nothing to do with the container runtime.
Multi-GPU is complicated, Proton is complicated, and SteamLinuxRuntime is complicated, so the combination of the three gets very confusing. Please try to narrow down where the problem is, with as few complicated things involved as possible:
- Get the overall system into the state where (some? all?) games are crashing on launch.
- Get the Help -> System Information while in that state (this runs some simple diagnostic tools). The Gist you provided was before removing AMDVLK, so its results are not necessarily the same as what you're seeing now. If you alter the system state during testing (binding/unbinding the GPU, etc.), please get a new System Information dump matching each log, so that we can compare them.
- Install a native Linux game that uses OpenGL. Counter-Strike: Global Offensive is free-to-play and actively maintained, and uses OpenGL by default. Floating Point is a much simpler free-to-play OpenGL game which can be useful to get a baseline for what a very simple scenario looks like.
- Also install a native Linux game that uses Vulkan. CS:GO will use Vulkan if you set its launch options to %command% -vulkan, which makes it useful for apples-to-apples comparisons between OpenGL and Vulkan.
- In the Properties of each of those games, go into the Compatibility tab, check "Force the use of a specific Steam Play compatibility tool", and choose "Steam Linux Runtime" from the list. This will result in those games running in a SteamLinuxRuntime_soldier container (the same as Proton 5.13+) with some compatibility glue to provide the same libraries as the traditional scout Steam Runtime.
- Try launching those games, and see whether they work or not.
  - If they work but Proton games do not, then this is probably a Proton problem.
  - If the native Linux games have the same issues in Steam Linux Runtime as the Proton games did, then this is probably a Steam Linux Runtime problem. To confirm, uncheck "Force the use of a specific Steam Play compatibility tool" for each game and try again.
  - If CS:GO works when the Launch Options are left blank but fails when they're set to %command% -vulkan, then this is a Vulkan-specific problem. Recent versions of Proton also use Vulkan when emulating most DirectX versions.
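The decision logic in the list above can be summarized as a rough sketch. It assumes Proton games are already known to fail, takes the results of the native OpenGL and Vulkan runs inside Steam Linux Runtime as ok/fail, and collapses the outcomes into three buckets. The function name and argument convention are invented for illustration, and real cases may be less clear-cut.

```shell
# Suggest where to look next, given native-game results under Steam Linux Runtime.
# $1: native OpenGL game result (ok|fail)
# $2: native Vulkan game result (ok|fail)
triage() {
    gl=$1
    vk=$2
    if [ "$gl" = ok ] && [ "$vk" = ok ]; then
        echo "probably a Proton problem"
    elif [ "$gl" = ok ] && [ "$vk" = fail ]; then
        echo "Vulkan-specific problem"
    else
        # Native games fail in the container too; confirm by re-testing
        # without the forced compatibility tool.
        echo "probably a Steam Linux Runtime problem"
    fi
}

# Example: triage ok ok
```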
A working NVIDIA GPU bound, w/o prime-run, on Proton 5.0: (couldn't find the steam runtime logfiles for some reason)
The SteamLinuxRuntime_soldier container runtime is not used for Proton 5.0, so it is correct and expected that you will not get a SteamLinuxRuntime_soldier/var/slr-*.log for Proton 5.0 games.
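To avoid attaching a log from a different run, a small sketch for picking the newest container-runtime log for one app ID. It assumes the file naming seen in the attachments (slr-app<AppID>-t<timestamp>.log), where the timestamp sorts correctly as plain text; the function name is hypothetical.

```shell
# Print the newest slr log for an app ID, or nothing if there is none.
# $1: Steam app ID
# $2: directory holding the slr-*.log files
latest_slr_log() {
    ls -1 "$2"/slr-app"$1"-t*.log 2>/dev/null | sort | tail -n 1
}

# Example (directory follows the install path shown in the console log):
# latest_slr_log 837470 ~/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier/var
```

An empty result for a Proton 5.0 run is expected, for the reason given above.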
A working NVIDIA GPU bound, w/ prime-run, on Proton 5.0: steam-837470.log slr-app1420170-t20221007T135842.log
These logs don't match: if it was using Proton 5.0, then you wouldn't get a slr-*.log for that run. slr-app1420170-t20221007T135842.log seems to be an unrelated log from running Proton\ 5.13/proton run /home/john/.local/share/Steam/ubuntu12_32/../bin/d3ddriverquery64.exe (see the first line).
-> % WINEPREFIX=~/.steam/steam/steamapps/compatdata/$GAMEID/pfx/ WINEARCH=win64 .steam/steam/steamapps/common/Proton\ 7.0/dist/bin/wine64 'winecfg.exe'
This is unsupported: Proton 5.13+ is intended to always be run in the SteamLinuxRuntime_soldier container environment, not on the host system. However, if this is also failing with the same symptoms as in the container runtime, then that suggests that the problem might be with Proton and not the container runtime.
Looks like bypassing the steam runtime with the arch steam-native script works too
This is also unsupported: the steam-for-linux binaries are intended to always be run with the (older, LD_LIBRARY_PATH-based) Steam Runtime, which is what steam-native disables. Scripts in the Steam Runtime are responsible for choosing whether to take each library from your host system or from the runtime (in most cases whichever one is newer must be used).
I'm surprised that steam-native has any effect on the container runtime - it only disables the older, LD_LIBRARY_PATH-based runtime mechanism (used by Steam itself, Proton <= 5.0 and most native Linux games) and shouldn't do anything to the container runtime. If steam-native vs. steam-runtime makes a difference, then there must be some relatively subtle interaction going on.
Are you sure you are running steam-native in exactly the same way that you were running Steam with the normal Steam Runtime enabled, so that the only difference is -native or not?
One thing that might be significant here is that if you run Steam from a desktop environment shortcut, most desktop environments will try to launch it on a discrete or non-default GPU using PRIME or similar (via PrefersNonDefaultGPU=true and X-KDE-RunOnDiscreteGpu=true), but if you run it from a command-line prompt, that will not take effect. So I wonder whether the difference might really be that you are running steam-native from a terminal (therefore on your default GPU), but running Steam in its normal supported mode from a desktop shortcut (therefore on your discrete GPU)?
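One way to check whether a launcher asks for the discrete GPU is to look for those two keys in its .desktop file. This is only a sketch: the function name is made up, and the example path /usr/share/applications/steam.desktop is an assumption that may differ per distribution.

```shell
# Succeed (exit 0) if a .desktop file requests the non-default/discrete GPU.
# $1: path to the .desktop file
desktop_prefers_dgpu() {
    grep -Eq '^(PrefersNonDefaultGPU|X-KDE-RunOnDiscreteGpu)[[:space:]]*=[[:space:]]*true' "$1"
}

# Example (path is an assumption):
# desktop_prefers_dgpu /usr/share/applications/steam.desktop && echo "shortcut requests the dGPU"
```

If the check succeeds, launching from the shortcut and launching from a terminal are not equivalent tests.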
More recent sysinfo w/ amdvlk disabled: https://gist.github.com/jhnphm/535dc9ee4154fee34648c712fc357eab
CS:GO works natively both w/ OpenGL and w/ Vulkan, and w/ the runtime set to Steam Linux Runtime, so it really does seem to be a Proton issue as opposed to a runtime issue.
The steam-native thing seems to be a red herring; I probably messed up some testing w/ the GPU in a bad state or hit some other weird transient problem. I can get Steam running Proton games w/ the latest Proton normally w/ the dGPU bound under Wayland though.
It might have something to do w/ binding the GPU after Xorg is started to keep Xorg from binding to it and making it un-unbindable for VMs w/o restarting the DE. [EDIT Nope, makes no difference].
Multi-GPU used to work on Xorg when I was using an AMD dGPU w/ an AMD iGPU, but the AMD card (Vega64) had other issues w/ VFIO that necessitated running Xorg instead of Wayland. Since everything now works under Wayland I can just use that, but if it's useful to chase this down I can provide more information.
Wayland sysinfo: https://gist.github.com/jhnphm/d378f7601301736401c72c684f6c6e3d