Mesa fails to render with libudev.so.0 to libudev.so.1.6.3 symlink and Steam Runtime Soldier 0.20220919.0

Open Ophidiophobia opened this issue 3 years ago • 26 comments

Your system information

  • Steam Runtime Version: latest Steam beta client
  • Distribution (e.g. Ubuntu 18.04): Void Linux
  • Link to your full system information (Help -> System Information) in a Gist:

https://gist.github.com/Ophidiophobia/4c560bcb6819a6ad1adc69d03d36dbff

  • Have you checked for system updates?: [Yes/No] Yes
  • What compatibility tool are you using?: [None / Steam Linux Runtime / Proton 5.13+ / older Proton] Proton 7.0
  • If you are using Steam Linux Runtime for native Linux games: What versions are listed in SteamLinuxRuntime/VERSIONS.txt?
  • If you are using Steam Linux Runtime, or Proton 5.13 or newer: What versions are listed in SteamLinuxRuntime_soldier/VERSIONS.txt?
#Name	Version		Runtime	Runtime_Version	Comment
depot	0.20220919.70			# Overall version number
pressure-vessel	0.20220919.0			
scripts	v0.20220823.0-0-gcc4e44f			# Entry point scripts, etc.
soldier	0.20220919.0	soldier	0.20220919.0	# soldier_platform_0.20220919.0/

Please describe your issue in as much detail as possible:

I try running Windows games through Proton and the games do not start. Most show some kind of error message (different for each game) or they silently abort. Tested with Raft, Borderlands 3, Cyberpunk 2077. Info provided below is only on Cyberpunk 2077.

Expectation: Game starts and shows intro or loads into main menu. In the case of Cyberpunk 2077, the game is marked as running in Steam for some time (less than a minute) and then silently stops without showing any window or other visible activity (launcher is skipped).

Steps for reproducing this issue:

  1. Start Cyberpunk 2077
  2. Wait until REDLauncher opens and click Run to start Cyberpunk (or use the Steam launch option "%command% --launcher-skip" on second run onward)
  3. Wait; the launcher closes, Steam shows Cyberpunk 2077 as running for some seconds until it looks like Cyberpunk 2077 was closed without further error indications

Additional Notes:

Workaround: Set Linux Runtime Soldier to use the beta branch "previous release"

It looks like the current Steam Runtime Soldier does not load any vulkan extensions.

Steam Proton Log of borked run with current Steam Runtime Soldier: https://gist.github.com/Ophidiophobia/fb62bb96a9932c07823e7462e7558c60

Steam Proton Log of successful run with previous release of Steam Runtime Soldier: https://gist.github.com/Ophidiophobia/64c23375106eae8bc693f83cb3833905

Ophidiophobia avatar Sep 23 '22 17:09 Ophidiophobia

I'm currently having a very similar issue with all my recent-ish games. The only difference is that rolling back to the previous version of Steam Runtime Soldier doesn't work (at least for Skyrim Special Edition).

My VERSIONS.txt (both for current and previous releases of soldier):

Current:

#Name	Version		Runtime	Runtime_Version	Comment
depot	0.20220919.70			# Overall version number
pressure-vessel	0.20220919.0			
scripts	v0.20220823.0-0-gcc4e44f			# Entry point scripts, etc.
soldier	0.20220919.0	soldier	0.20220919.0	# soldier_platform_0.20220919.0/

Previous:

#Name	Version		Runtime	Runtime_Version	Comment
depot	0.20220804.66			# Overall version number
pressure-vessel	0.20220803.0			
scripts	v0.20220803.0-0-gbca628e			# Entry point scripts, etc.
soldier	0.20220803.0	soldier	0.20220803.0	# soldier_platform_0.20220803.0/

Logs and system info running Skyrim Special Edition (proton versions 5.0, 6.3, 7.0):

  • Proton 5.0 - working: https://gist.github.com/hjeldin/f93ab89098cc26fea83fb78cd5d1302f
  • Proton 5.0 - not working: https://gist.github.com/hjeldin/ce14acf39b181f76546b812c181f0dbd
  • Proton 6.3 - not working: https://gist.github.com/hjeldin/997d0179b660eac42ce0930449b1167f
  • Proton 7.0 - not working: https://gist.github.com/hjeldin/4d0abb370c77f80ed94e324acaa150fe

Apparently running Skyrim SE with Proton 5.0 worked only once. Now that I have rebooted, it doesn't work anymore.

As already said, Skyrim isn't the only game that doesn't run. So far I've tried Skyrim SE, Hardspace: Shipbreaker and Sea of Thieves. Sid Meier's Civilization VI and X4: Foundations work, but those are native Linux ports so they don't count.

I noticed that the issue started occurring after a system update and an install of Lutris and its whole Wine stack. @Ophidiophobia did you install or update Lutris recently?

hjeldin avatar Sep 23 '22 20:09 hjeldin

@hjeldin I suspect your issue is unrelated. Consider making a separate report.

TTimo avatar Sep 23 '22 21:09 TTimo

This issue seems to be caused by libudev.so.1:

        "libudev.so.1" : {
          "messages" : [
            "Unable to find the library: libudev.so.1: cannot open shared object file: No such file or directory"
          ],
          "soname" : null,
          "path" : null,
          "issues" : [
            "cannot-load"
          ],
          "exit-status" : 1
        },

Probably due to this unexpected Soldier override:

 "overrides/lib/x86_64-linux-gnu/libudev.so.0 -> /run/host/usr/lib/libudev.so.1.6.3",

This should mean that you have libudev.so.1.6.3 in your host system, is that correct? Which package is providing it? Can you please paste the output of ls -la /usr/lib/libudev*

RyuzakiKK avatar Sep 24 '22 10:09 RyuzakiKK

This should mean that you have libudev.so.1.6.3 in your host system, is that correct? Which package is providing it? Can you please paste the output of ls -la /usr/lib/libudev*

/usr/lib/libudev.so -> ../../lib/libudev.so.1.6.3

On Slackware libudev.so.1.6.3 is provided by eudev-compat32-3.2.11-x86_64-1compat32 which is the 32-bit version. The 64-bit package on Slackware is named "eudev-3.2.11-x86_64-1"

2A4U avatar Sep 24 '22 21:09 2A4U

This should mean that you have libudev.so.1.6.3 in your host system, is that correct? Which package is providing it? Can you please paste the output of ls -la /usr/lib/libudev*

$ls -la /usr/lib/libudev*
-rw-r--r-- 1 root root 323162 Jul 24 11:48 /usr/lib/libudev.a
lrwxrwxrwx 1 root root     16 Jul 24 11:48 /usr/lib/libudev.so -> libudev.so.1.6.3
lrwxrwxrwx 1 root root     21 Sep 22  2020 /usr/lib/libudev.so.0 -> /usr/lib/libudev.so.1
lrwxrwxrwx 1 root root     16 Jul 24 11:48 /usr/lib/libudev.so.1 -> libudev.so.1.6.3
-rwxr-xr-x 1 root root 158016 Jul 24 11:48 /usr/lib/libudev.so.1.6.3

libudev.a and libudev.so are installed by eudev-libudev-devel (I think I needed it to compile wine). libudev.so.1.6.3 and the symbolic link libudev.so.1 are installed by eudev-libudev (a necessary package even for a base install). libudev.so.0 was likely created by me to satisfy some application that looked for that file. I cannot remember which one it was.

Ophidiophobia avatar Sep 24 '22 22:09 Ophidiophobia

Symlinking libudev.so.0 to libudev.so.1 is not ABI compatible, and this was a snafu waiting to happen with something (not necessarily with Steam games). Can you remove the manually created symlink and retest?
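
For anyone checking their own system for this configuration, here is a minimal Python sketch. It only reports whether libudev.so.0 and libudev.so.1 in a directory resolve to the same real file. The function name and the list of directories are illustrative assumptions, not part of any Steam tooling.

```python
import os


def find_aliased_udev(libdir):
    """Return the shared real path if libudev.so.0 and libudev.so.1 in
    libdir resolve to the same file, otherwise None."""
    old = os.path.join(libdir, "libudev.so.0")
    new = os.path.join(libdir, "libudev.so.1")
    if not (os.path.exists(old) and os.path.exists(new)):
        return None
    # realpath() follows every symlink in the chain, so
    # libudev.so.0 -> libudev.so.1 -> libudev.so.1.6.3 is handled too.
    old_real = os.path.realpath(old)
    new_real = os.path.realpath(new)
    return old_real if old_real == new_real else None


if __name__ == "__main__":
    # Assumed set of directories; adjust for your distribution.
    for libdir in ("/usr/lib", "/usr/lib64", "/lib", "/lib64"):
        hit = find_aliased_udev(libdir)
        if hit:
            print(f"{libdir}: libudev.so.0 is an alias for {hit}")
```

A separate library such as libudev0-shim would resolve to a different real file and so would not be reported.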

kisak-valve avatar Sep 24 '22 22:09 kisak-valve

Can you remove the manually created symlink and retest?

I removed the symlink, rebooted, switched to current version of soldier and now things work as expected.

Ophidiophobia avatar Sep 25 '22 00:09 Ophidiophobia

Symlinking libudev.so.0 to libudev.so.1 is not ABI compatible, and this was a snafu waiting to happen with something (not necessarily with Steam games). Can you remove the manually created symlink and retest?

On Slackware there are no libudev symlinks in the 32-bit folder /usr/lib or /lib.

/usr/lib64/libudev.so.1 points to libudev.so.0 which points to ../../lib64/libudev.so.1.6.3

The deleted symlink /usr/lib64/libudev.so.1 gets automatically regenerated after reboot.

So far, only reverting to a previous version of Steam Runtime Soldier works on Slackware.

2A4U avatar Sep 25 '22 11:09 2A4U

@2A4U eudev-3.2.11 provides lib64/libudev.so.1.6.3, and its doinst.sh creates the symlinks libudev.so.1 -> libudev.so.1.6.3 and libudev.so -> ../../lib64/libudev.so.1.6.3. I can't find any reference to libudev.so.0 at all. Where is the symlink libudev.so.0 coming from?

RyuzakiKK avatar Sep 25 '22 12:09 RyuzakiKK

I can't find any reference to libudev.so.0 at all. Where is the symlink libudev.so.0 coming from?

Unknown.

I deleted /usr/lib64/libudev.so.0 and /usr/lib64/libudev.so.1 and now games start with the recent Steam Runtime Soldier.

2A4U avatar Sep 26 '22 06:09 2A4U

@2A4U Thanks for confirming. Please ensure that the libudev.so.1 -> libudev.so.1.6.3 symlink is still in place though. If it didn't get recreated automatically after the reboot, maybe you could force a reinstall of eudev, or do something similar.

@kisak-valve I guess we can close this. If you want to have a more descriptive title for historic purposes, maybe we could change the title to mention libudev.so.0.

RyuzakiKK avatar Sep 26 '22 07:09 RyuzakiKK

Closing per the last several comments.

kisak-valve avatar Sep 26 '22 12:09 kisak-valve

I don't know why a libudev.so.0 symlink would have this effect, but we might be able to avoid this sort of thing happening again by adjusting the changes made in https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/merge_requests/498 so they only affect libudev.so.1 and not libudev.so.0.

smcv avatar Sep 26 '22 14:09 smcv

@Ophidiophobia, @2A4U: On your Void Linux and Slackware systems, what is the "official" SONAME of the library that provides libudev.so.1? You can find out by running something like this:

objdump -T -x /usr/lib64/libudev.so.1 | less

and looking for SONAME under the Dynamic Section heading.

For comparison, on my Debian system (which uses "original" libudev as provided by systemd, rather than eudev), what I see is:

/usr/lib/x86_64-linux-gnu/libudev.so.1:     file format elf64-x86-64
...
Dynamic Section:
  NEEDED               libc.so.6
  NEEDED               ld-linux-x86-64.so.2
  SONAME               libudev.so.1                      <--- this is the header I want to know your equivalent of
  INIT                 0x0000000000004000
...
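
If paging through the output by hand is inconvenient, the same lookup can be sketched in Python. This is only an illustration: parse_soname and read_soname are hypothetical helper names, and read_soname assumes binutils' objdump is on PATH.

```python
import subprocess


def parse_soname(objdump_text):
    """Return the SONAME listed under 'Dynamic Section:' in objdump -x
    output, or None if there is none."""
    in_dynamic = False
    for line in objdump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Dynamic Section:"):
            in_dynamic = True
            continue
        if in_dynamic:
            if not stripped:
                break  # a blank line ends the dynamic section
            fields = stripped.split()
            if fields[0] == "SONAME" and len(fields) > 1:
                return fields[1]
    return None


def read_soname(path):
    """Run objdump on path and extract the SONAME (assumes objdump exists)."""
    result = subprocess.run(
        ["objdump", "-x", path],
        check=True, capture_output=True, text=True,
    )
    return parse_soname(result.stdout)
```

For example, read_soname("/usr/lib64/libudev.so.1") would return the string to report back here.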

smcv avatar Sep 26 '22 15:09 smcv

The deleted symlink /usr/lib64/libudev.so.1 gets automatically regenerated after reboot.

This suggests that the answer to my question is probably libudev.so.1, similar to what I see on Debian, but confirmation for this would be useful.

smcv avatar Sep 26 '22 15:09 smcv

As @kisak-valve said, having libudev.so.0 as a symlink to libudev.so.1 is not entirely safe: the "version 1" ABI is not fully backwards-compatible with the "version 0" ABI (if it was compatible, then the version would not have been increased from 0 to 1).

If old binaries want libudev.so.0, the safer way to provide it is either something like the Steam Runtime that literally provides an older libudev, or something similar to the libudev0-shim originating in Arch Linux (also available in e.g. Debian) which provides stub versions of the small number of symbols in the "version 0" ABI that are no longer present in the "version 1" ABI.

However, because the version 0 and version 1 ABIs are almost compatible, you're probably not the only people to have had this configuration, so I'm looking into why this made the container fail and what can be done to avoid it. I will probably need more information from someone using an affected OS, because I haven't been able to reproduce an equivalent issue on a test system so far.

smcv avatar Sep 26 '22 17:09 smcv

@Ophidiophobia, @2A4U: On your Void Linux and Slackware systems, what is the "official" SONAME of the library that provides libudev.so.1? You can find out by running something like this:

objdump -T -x /usr/lib64/libudev.so.1 | less
$ objdump -T -x /usr/lib64/libudev.so.1 | head -50
[...]
Dynamic Section:
  NEEDED               libc.so.6
  NEEDED               ld-linux-x86-64.so.2
  SONAME               libudev.so.1
  INIT                 0x0000000000005000

libudev.so.1 is official. With regard to this issue, the existence of libudev.so.0 was a user error.

As I already stated I created the .0 symlink myself 2 years ago. It was needed to run some binary or to compile some source code which was not part of Void Linux (and considering it wanted libudev.so.0 it was likely very old).

I can not speak for 2A4U. He did not provide his Steam System info in this thread. I do not know if his issue is because of the same lib or because he has a different lib with similar issues.

Ophidiophobia avatar Sep 26 '22 17:09 Ophidiophobia

OK, that's what I expected; but in that case I don't understand why I couldn't reproduce this.

@Ophidiophobia, would you be able to temporarily create the .0 symlink again, and capture a detailed log of a game failing to launch in that state? You can delete the .0 symlink afterwards. I realise the .0 symlink was misconfiguration, but it would be better if we could cope more gracefully with that misconfiguration, and at the moment I don't understand why it's having the effect that it does.

The Proton logs you attached in the original report are useful for Proton bugs, but they don't have the lower-layer information I need for Steam Linux Runtime bugs. Instead, please set a Proton game's Launch Options to:

STEAM_LINUX_RUNTIME_LOG=1 STEAM_LINUX_RUNTIME_VERBOSE=1 CAPSULE_DEBUG=all %command%

and then reproduce the bug (try to launch it), and find the log file at .../SteamLinuxRuntime_soldier/var/slr-latest.log (that path will be a symbolic link to the actual log).

The log will be quite large, but it should compress well. Please either attach it here, or send it privately to smcv at collabora dot com, or upload to something like Dropbox and send me a link.

After collecting the log, you can remove the .0 symlink and change the game's launch options back to normal.

smcv avatar Sep 26 '22 18:09 smcv

@2A4U might have the same root cause for this as @Ophidiophobia because you both share some common factors that are potentially significant: you had a libudev.so.0 symlink, pointing to the same library as libudev.so.1; removing the symlink resolved the problem for you; and your libudev library is provided by eudev, whereas most Steam users will be using systemd's libudev.

Or, you might have a different root cause. I can't tell from the information available. A detailed log (see my previous comment) and a Help -> System Information report would be a useful way to find out whether there is something that the container runtime framework could be doing to make this more reliable.

smcv avatar Sep 26 '22 18:09 smcv

I used

STEAM_LINUX_RUNTIME_LOG=1 STEAM_LINUX_RUNTIME_VERBOSE=1 CAPSULE_DEBUG=all xDXVK_ASYNC=1 %command% --launcher-skip -xskipStartscreen

  • xDXVK_ASYNC=1: DXVK_ASYNC, but "commented out" for test purposes
  • --launcher-skip: saves a few clicks and (I assume) makes the log a lot shorter
  • -xskipStartscreen: supposed to do something, but "commented out" like DXVK_ASYNC above

The log is just 2.8 MB (I expected at least an order of magnitude more). https://gist.github.com/Ophidiophobia/469e9f43384ba873a85f824304a917ad

Ophidiophobia avatar Sep 26 '22 23:09 Ophidiophobia

Like all the best bugs, this is more complicated than just one cause. The important factors seem to be:

  • the libudev.so.0 -> libudev.so.1 or libudev.so.0 -> libudev.so.1.x.y symlink
  • the system libudev.so.1.x.y compares as older than the soldier container's libudev.so.1.6.13
    • in both Void and Slackware this is because eudev is derived from an older version of the original udev and so declares that it is ABI-compatible with an older version of libudev, but I think it would also be possible for this to happen on a non-eudev host system with a version of the original libudev that is genuinely that old, like perhaps Debian 9
  • newer versions of the container runtime try to take both libudev.so.1 and libudev.so.0 from the host system, to make it more likely that game controllers will work in games that bypass SDL and Proton, and instead use libudev directly, on a machine where something like libudev0-shim is installed

Because your libudev.so.1 compares as older than the container's, we don't want to use your libudev.so.1 (we have to always use whichever version is newer, otherwise it can cause games to fail with missing symbols). The container infrastructure correctly detects this and does not create /usr/lib/pressure-vessel/overrides/lib/*/libudev.so.1.

However, your libudev.so.0 symlink is "captured" into /usr/lib/pressure-vessel/overrides/lib/*, because the container runtime doesn't have libudev.so.0. This would be fine if it was a separate library with SONAME libudev.so.0 like libudev0-shim, but because it's actually just a symlink to libudev.so.1, the result is that the container infrastructure gets into a strange state where ldconfig thinks your libudev.so.0 is an implementation of libudev.so.1, but there is actually no libudev.so.1 symlink, so anything that requires libudev.so.1 will fail to load.
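
To make the interaction concrete, here is a toy Python model of the decision described above. It is a simplified sketch, not pressure-vessel's actual code: the function names, the dictionary layout and the tie-breaking rule are all assumptions made for illustration.

```python
def version_key(real_name):
    """'libudev.so.1.6.3' -> (1, 6, 3). Comparing integers matters here:
    as plain strings, '1.6.3' would sort after '1.6.13'."""
    return tuple(int(part) for part in real_name.split(".so.", 1)[1].split("."))


def plan_overrides(host, container):
    """host and container map a library name (e.g. 'libudev.so.1') to the
    real file it resolves to. Returns the names taken from the host."""
    overrides = {}
    for name, host_real in host.items():
        container_real = container.get(name)
        if container_real is None:
            # The container has nothing under this name: capture the host's.
            overrides[name] = host_real
        elif version_key(host_real) >= version_key(container_real):
            # Host is at least as new (tie handling is an arbitrary choice here).
            overrides[name] = host_real
    return overrides


def mismatched(overrides, real_soname):
    """Captured names whose target declares a different SONAME."""
    return {
        name: real_soname[target]
        for name, target in overrides.items()
        if real_soname[target] != name
    }


host = {"libudev.so.0": "libudev.so.1.6.3", "libudev.so.1": "libudev.so.1.6.3"}
container = {"libudev.so.1": "libudev.so.1.6.13"}
overrides = plan_overrides(host, container)
print(overrides)
print(mismatched(overrides, {"libudev.so.1.6.3": "libudev.so.1"}))
```

In this model only libudev.so.0 is captured, and its target declares SONAME libudev.so.1, which corresponds to the inconsistent state described above.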

smcv avatar Sep 27 '22 10:09 smcv

Since this is actively being pondered, I'm re-opening this issue report.

kisak-valve avatar Sep 27 '22 12:09 kisak-valve

We've been able to reproduce a similar failure mode with the help of @Ophidiophobia's log. A future update to the container runtime should avoid the crash in this situation.

smcv avatar Sep 27 '22 13:09 smcv

As a short-term solution for this, pressure-vessel 0.20220927.0 just doesn't pick up libudev.so.0 at all. This change should be in the next SteamLinuxRuntime_soldier beta.

If you want to try this change early, ~~you can do so by replacing SteamLinuxRuntime_soldier/pressure-vessel with the result of unpacking this: https://repo.steampowered.com/pressure-vessel/snapshots/0.20220927.0/pressure-vessel-bin.tar.gz. If you have replaced pressure-vessel like that, please mention it in any other issue reports you make (issue reports are still welcome, but we need to be able to keep track of which versions each issue reporter is using).~~ [edit: no longer necessary]

You can revert to the default version of pressure-vessel by doing a Verify Integrity on "Steam Linux Runtime - soldier".

@RyuzakiKK is looking into a longer-term solution for this - we're hoping we can pick up libudev0-shim, while still treating libudev.so.0 -> libudev.so.1 or libudev.so.0 -> libudev.so.1.x.y symlinks as misconfiguration and ignoring them.

smcv avatar Sep 27 '22 18:09 smcv

As a short-term solution for this, pressure-vessel 0.20220927.0 just doesn't pick up libudev.so.0 at all. This change should be in the next SteamLinuxRuntime_soldier beta.

This change is now available in the client_beta branch, and it will be promoted to the default branch after it has had some more testing. To have this change, SteamLinuxRuntime_soldier/VERSIONS.txt needs to say pressure-vessel 0.20220927.0 or later.
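
As a rough way to check this without reading the file by eye, the following Python sketch pulls the pressure-vessel version out of VERSIONS.txt and compares it numerically. The helper names are hypothetical, and the comparison assumes a purely numeric dotted version such as 0.20220927.0 (it would not handle the scripts row, which uses a different format).

```python
def component_version(versions_txt, component):
    """Return the Version column for component, or None if it is absent."""
    for line in versions_txt.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header row
        fields = line.split()
        if fields[0] == component and len(fields) > 1:
            return fields[1]
    return None


def as_tuple(version):
    """'0.20220927.0' -> (0, 20220927, 0)."""
    return tuple(int(part) for part in version.split("."))


def at_least(version, minimum):
    """True if version is the same as or newer than minimum."""
    return as_tuple(version) >= as_tuple(minimum)
```

Typical use would be to read SteamLinuxRuntime_soldier/VERSIONS.txt into a string and evaluate at_least(component_version(text, "pressure-vessel"), "0.20220927.0").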

smcv avatar Oct 03 '22 16:10 smcv

looking into a longer-term solution for this - we're hoping we can pick up libudev0-shim, while still treating libudev.so.0 -> libudev.so.1 or libudev.so.0 -> libudev.so.1.x.y symlinks as misconfiguration and ignoring them

This change will be in a future beta.

smcv avatar Oct 06 '22 13:10 smcv

looking into a longer-term solution for this - we're hoping we can pick up libudev0-shim, while still treating libudev.so.0 -> libudev.so.1 or libudev.so.0 -> libudev.so.1.x.y symlinks as misconfiguration and ignoring them

This improvement landed in today's new betas (soldier 0.20221018.74 and sniper 0.20221018.57, the important version number here is pressure-vessel 0.20221014.0 or later in VERSIONS.txt).

smcv avatar Oct 18 '22 17:10 smcv

Also in today's round of updates, the shorter-term solution described in https://github.com/ValveSoftware/steam-runtime/issues/533#issuecomment-1265730662 was promoted to the default branch and no longer needs a beta. To have this change, VERSIONS.txt needs to say pressure-vessel 0.20220927.0 or later.

smcv avatar Oct 18 '22 18:10 smcv

Thanks for investigating. It looks like we're done here. Closing.

kisak-valve avatar Nov 19 '22 02:11 kisak-valve