Graphical applications crash with Xorg error after being visible for a split second
Your system information
- Steam Runtime Version: Soldier 0.20220803.0
- Distribution (e.g. Ubuntu 18.04): Arch Linux
- Link to your full system information (Help -> System Information) in a Gist: https://gist.github.com/Matoking/daecb6d447f58598de60148cbee0de99
- Have you checked for system updates?: Yes
- What compatibility tool are you using?: Proton 7.0
- If you are using Steam Linux Runtime, or Proton 5.13 or newer: What versions are listed in
SteamLinuxRuntime_soldier/VERSIONS.txt?
#Name Version Runtime Runtime_Version Comment
depot 0.20220804.66 # Overall version number
pressure-vessel 0.20220803.0
scripts v0.20220803.0-0-gbca628e # Entry point scripts, etc.
soldier 0.20220803.0 soldier 0.20220803.0 # soldier_platform_0.20220803.0/
Please describe your issue in as much detail as possible:
When running graphical applications such as winecfg and regedit using Steam Runtime and Proton 7.0, the window will appear for a split second and then crash with the following X11 error:
X Error of failed request: BadWindow (invalid Window parameter)
Major opcode of failed request: 10 (X_UnmapWindow)
Resource id in failed request: 0x9600001
Serial number of failed request: 225
Current serial number in output stream: 227
To reproduce the issue, the wineserver process must be alive for multiple wine calls, and the command to launch the graphical application must be run twice for the crash to occur.
Steps for reproducing this issue:
- Launch a Steam game using Proton 7.0 with PROTON_DUMP_DEBUG_COMMANDS=1 set.
- Go to the created debug directory (`/tmp/proton_USERNAME`) and edit `run` to remove `steam.exe` from the call, and replace `wine64` with `wine`; this latter change ensures the X11 error is printed, for whatever reason.
- Open the installation directory for "Steam Linux Runtime - Soldier".
- Run the command `./run --share-pid --launcher --filesystem /mnt -- --bus-name "com.foo.TestProton.Test"` in the Steam Runtime directory to launch a Steam Runtime session (adjust `--filesystem` as needed to ensure the runtime directory is mounted inside the container). Keep this command running.
- Open another session in `<Steam Runtime directory>/pressure-vessel/bin` and run the following command: `./steam-runtime-launch-client --bus-name "com.foo.TestProton.Test" --share-pids --directory /tmp/proton_USERNAME -- ./run cmd.exe`. Keep this command running as well, since it ensures a `wineserver` process is left running for the duration of the next Wine calls.
- Open a third session and run the command `./steam-runtime-launch-client --bus-name "com.foo.TestProton.Test" --share-pids --directory /tmp/proton_USERNAME -- ./run winecfg` twice. On the first run, the configuration window should appear as normal. On the second run, it should show up for a moment and then close itself, with the X11 error appearing on the command line. Any subsequent attempts will crash as well.
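The last two steps repeat the same launch-client invocation with a different program, so a small wrapper may make repeated attempts less error-prone. This is only a sketch: `run_in_container` and the `DRY_RUN` switch are hypothetical helpers of mine, not part of Steam Runtime, and the bus name and debug directory are the placeholder values used in the steps above. By default it only prints the command it would run.

```shell
#!/bin/sh
# Placeholder values copied from the reproduction steps; override as needed.
BUS_NAME="${BUS_NAME:-com.foo.TestProton.Test}"
PROTON_DEBUG_DIR="${PROTON_DEBUG_DIR:-/tmp/proton_USERNAME}"

# Hypothetical helper: run (or just print) a Proton command inside the
# already-running container session. Call it from
# <Steam Runtime directory>/pressure-vessel/bin.
run_in_container() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run: show the command line instead of executing it.
        printf '%s\n' "./steam-runtime-launch-client --bus-name \"$BUS_NAME\" --share-pids --directory $PROTON_DEBUG_DIR -- ./run $*"
    else
        ./steam-runtime-launch-client --bus-name "$BUS_NAME" --share-pids \
            --directory "$PROTON_DEBUG_DIR" -- ./run "$@"
    fi
}

# Usage:
#   DRY_RUN=0; run_in_container cmd.exe   # second session, keeps wineserver alive
#   DRY_RUN=0; run_in_container winecfg   # third session, run this twice
```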
~~The issue can be reproduced on Proton 7.0-4 and 5.13-6, but not on Proton 6.3-8.~~ The issue also can't be reproduced if Steam Runtime is not used.
Logs for each command with WINEDEBUG="+timestamp,+pid,+tid,+seh,+debugstr,+loaddll,+mscoree" and --verbose can be found here:
https://gist.github.com/Matoking/9db8e55c8bbf3325a4613db2ffc59cdb
> ./run --share-pid --launcher --filesystem /mnt -- --bus-name "com.foo.TestProton.Test"
Please be aware that running Proton or SteamLinuxRuntime from outside Steam is not really a supported configuration. The container runtime and Proton both expect to be run from inside Steam in order to behave correctly: if you run them externally, they will not have the expected environment variables set and a lot of things won't work as intended.
> The issue can be reproduced on Proton 7.0-4 and 5.13-6, but not on Proton 6.3-8
I think someone with Proton knowledge will need to look at this. The container runtime infrastructure doesn't talk to X11, and if X11 can show a window at all (even intermittently), then the container runtime has done its job by providing the X11 socket and the XAUTHORITY data.
You might find that you get different results by using the client_beta branch of Steam Linux Runtime - soldier, which changed the way it sets up X11 so that it tries to reuse the same display number that's used on the host system (often :0 or :1) instead of remapping it to :99. I don't know whether that will help to solve this problem or not.
If you think this could be a recent regression, it might also be useful to try the previous_release branch of Steam Linux Runtime - soldier and see whether it works there.
Your log mentions some options that you didn't mention in the original issue report, like --pass-env XAUTHORITY --pass-env DISPLAY. This might be interfering with the container runtime setup: the DISPLAY and XAUTHORITY environment variables inside the container are intentionally not the same as on the host system.
If you don't use --pass-env, they should inherit the correct values from the steam-runtime-launcher-service.
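One way to see whether the two variables really differ is to print them on the host and inside the container and compare. The sketch below is an assumption on my part, not a documented procedure: `x11_env` is a hypothetical helper that filters `env`-style output, and running `env` through `steam-runtime-launch-client` assumes the launcher session from the reproduction steps is still running under the same bus name.

```shell
#!/bin/sh
# Hypothetical helper: keep only the X11-related variables from
# `env`-style input on stdin, in a stable order for easy diffing.
x11_env() {
    grep -E '^(DISPLAY|XAUTHORITY)=' | sort
}

# Usage:
#   env | x11_env    # on the host
#   ./steam-runtime-launch-client --bus-name "com.foo.TestProton.Test" -- env | x11_env    # inside the container
```

If the container values match the host values only when `--pass-env` is given, that would suggest the options are overriding what the launcher service set up.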
I tested both previous_release:
#Name Version Runtime Runtime_Version Comment
depot 0.20220727.64 # Overall version number
pressure-vessel 0.20220726.0
scripts v0.20220726.0-0-ga110829 # Entry point scripts, etc.
soldier 0.20220726.0 soldier 0.20220726.0 # soldier_platform_0.20220726.0/
and client_beta:
#Name Version Runtime Runtime_Version Comment
depot 0.20220919.70 # Overall version number
pressure-vessel 0.20220919.0
scripts v0.20220823.0-0-gcc4e44f # Entry point scripts, etc.
soldier 0.20220919.0 soldier 0.20220919.0 # soldier_platform_0.20220919.0/
Both still cause the crash.
This crash didn't occur before, however, so I'll have to see whether I can find the runtime version that introduced the issue. The older versions might be available through Steam depots. I also tried compiling the runtime myself so I could bisect the issue more precisely, but that turned out to be more time-consuming than I expected.
> Your log mentions some options that you didn't mention in the original issue report, like `--pass-env XAUTHORITY --pass-env DISPLAY`. This might be interfering with the container runtime setup: the `DISPLAY` and `XAUTHORITY` environment variables inside the container are intentionally not the same as on the host system.
>
> If you don't use `--pass-env`, they should inherit the correct values from the `steam-runtime-launcher-service`.
I uploaded new logs here:
https://gist.github.com/Matoking/e0459d62b429584fd09731c4dd6da69b
I initially used both parameters, but noticed they didn't affect the result, so I tried again but forgot to update my logs. The end result is the same, though.
Also, I managed to reproduce the crash on Proton 6.3 as well. It takes a little more effort, however, since I had to close the cmd.exe process and reopen it again. It also turns out the issue can't be reproduced fully deterministically on other Proton versions either: sometimes they don't reproduce the issue immediately, requiring another attempt before the crash starts occurring.
I don't have experience with X11, but a quick lookup of the error in question suggests that there may be a stale handle of some sort that causes the crash, which might explain why it takes at least two attempts for the crash to occur?
@kisak-valve, please could you point Proton people towards this?
> This crash didn't occur before
Before when? Possible triggers, other than the container runtime, include:
- upgrading Proton
- upgrading some library on your Arch system
You've tested all the container runtime releases for the last few weeks, so my suggestion would be to look at what else has changed. Does Arch's package manager have an equivalent of /var/log/apt/history.log that would tell you what you upgraded at around the time this started happening?
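For what it's worth, pacman keeps a transaction log at /var/log/pacman.log, with one `[ALPM] upgraded ...` line per upgraded package, each prefixed by a timestamp. A minimal sketch for listing the upgrades from a given month or day follows; `pacman_upgrades` is a hypothetical helper name, and the date prefix is whatever period you suspect.

```shell
#!/bin/sh
# Hypothetical helper: list package upgrades recorded by pacman whose
# timestamp starts with a given date prefix.
#   $1 = date prefix, e.g. 2022-09 or 2022-09-01
#   $2 = log path (defaults to /var/log/pacman.log)
pacman_upgrades() {
    grep -F "[ALPM] upgraded" "${2:-/var/log/pacman.log}" | grep -E "^\[$1"
}

# Usage:
#   pacman_upgrades 2022-09
#   pacman_upgrades 2022-09 | grep -E 'libx11|libxcb'
```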
> a quick lookup of the error in question suggests that there may be a stale handle of some sort that causes the crash
The container runtime's involvement in X11 should be mostly limited to "X11 works" or "X11 doesn't work"; anything involving state, windows, etc. is between the X11 client (a Proton/Wine process) and the server (Xorg or Xwayland). It's weird that this is only happening with the container runtime; maybe it's related to differing versions of some library like libxcb or libX11 that is involved in the stateful parts of the X11 protocol?
The issue was reported on the Protontricks repository here and here on September 2. Both of the users use SteamOS 3.3.1 on Steam Deck, and I was able to reproduce the issue on my Arch Linux installation as well.
The issue could have appeared earlier though. I only first noticed it after checking one of the linked bug reports.
I was hitting this exact same problem and banging my head on it for hours, and it's specific to running inside a flatpak.
Turns out it's simply because the application running the runtime needs background permissions.
See: https://github.com/flatpak/flatpak/issues/5427 https://www.reddit.com/r/flatpak/comments/15tzx0w/flatpak_apps_close_a_few_seconds_after_opening/
For reference -- we built ULWGL around the steam runtime, we are launching non-steam games using the steam runtime + custom scripts to pass the required envvars it needs and a custom proton version. When running inside flatpak the application -would- run for a second or so before completely closing. After finding the above mentioned flatpak issue we added this to our flatpak:
- --talk-name=org.freedesktop.portal.Background
And it resolved the issue. If that does not work you can also try enabling the 'Background' toggle in flatseal
It should also be noted that this appears to happen with flatpak-builder builds installed from the build folder. When built then installed from a local repo the issue did not occur.
> I was hitting this exact same problem
Are you sure it was the exact same problem? Including the characteristic BadWindow (invalid Window parameter) in X_UnmapWindow?
The symptom "visible for a split second, and then crashes with BadWindow" described on this issue is not the same as "visible for a second or so, and then killed with SIGKILL", even though it's superficially similar.
> After finding the above mentioned flatpak issue we added this to our flatpak:
>
> --talk-name=org.freedesktop.portal.Background
This should never be necessary. Flatpak is hard-coded to do the equivalent of --talk-name=org.freedesktop.portal.* without any further action from you - the whole point of portals is that they're something that is safe to give to every sandboxed app, because they have taken responsibility for prompting the user for permission where necessary.
> If that does not work you can also try enabling the 'Background' toggle in flatseal
If your app was being killed by the Background portal, then I think you'll find that this is actually how you resolved the problem.
Since xdg-desktop-portal 1.18, it should log a message to the systemd Journal whenever it does this, which will look like:
Terminating app xyz (process 12345) because the app does not have permission to run in the background. You may be able to grant this app the permission to run in background in the system settings of your desktop environment.
xdg-desktop-portal >= v1.17 also allows apps to run in the background by default, but pre-existing apps might have an old entry in the permissions database: see https://github.com/flatpak/flatpak/issues/5427#issuecomment-1826775192.
If you have an older version of xdg-desktop-portal, I would recommend upgrading if possible.
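To check whether the Background portal is what killed an app, you could search the journal for the message quoted above. This is a sketch rather than an official procedure: `portal_kills` is a hypothetical helper, and the usage line assumes xdg-desktop-portal runs as a systemd user unit named `xdg-desktop-portal`, which may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: filter journal text on stdin for the Background
# portal's termination message.
portal_kills() {
    grep -E 'Terminating app .* because the app does not have permission to run in the background'
}

# Usage (unit name is an assumption):
#   journalctl --user -u xdg-desktop-portal --no-pager | portal_kills
```

No output here would point away from the Background portal and back towards a different cause, such as the BadWindow error this issue is about.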