
WSL login nukes systemd/dbus user session / contents of /run/user/1000

Open sarim opened this issue 2 years ago • 21 comments

Windows Version

Microsoft Windows [Version 10.0.22621.1778]

WSL Version

1.3.10.0

Are you using WSL 1 or WSL 2?

  • [X] WSL 2
  • [ ] WSL 1

Kernel Version

5.15.90.2-microsoft-standard-WSL2

Distro Version

Ubuntu 22.04

Other Software

No response

Repro Steps

Make sure guiApplications=true in .wslconfig.

  1. Open WSL from Windows Terminal.
  2. Run systemctl --user status
  3. Or run ls /run/user/1000

Expected Behavior

Expects the systemd user session to be present and systemctl --user to be able to connect to it.

This is somewhat related to #8842, but it is not the same issue. This is not the race-condition issue; rather, the WSL login is nuking the bus and other sockets in the /run/user/1000 directory. See the attached video demonstration.

If we follow these steps:

  1. In PowerShell, run wsl --shutdown a few times to be sure.
  2. Run wsl -u root -e /bin/bash. This makes WSL log in as root, so WSL doesn't touch user 1000 (named gittu). This user has linger enabled, so systemd naturally creates the user session.
  3. While logged in as root, ls /run/user/1000 shows the proper bus, systemd, etc. sockets.
  4. Now open WSL as user gittu (the default user) by opening a new tab in Windows Terminal.
  5. Observe that a [ 29.574698] WSL (2): Creating login session for gittu line appears in the dmesg output, confirming that WSL created a login session for gittu, nuking the previously good user session created by systemd.
  6. Now the output of ls /run/user/1000/ no longer has the bus, systemd, etc. sockets.

Now, if I disable WSLg with guiApplications=false, the issue goes away: WSL doesn't nuke the contents of /run/user/1000. From this observation, my conclusion is that WSLg is nuking the contents of /run/user/1000 and manually putting only WSLg's files there.
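The nuked state described above can be detected with a small script. This is a hedged sketch: `check_runtime_dir` and the simulated directory are illustrative names, not part of WSL; the demo reproduces the broken directory listing from this report in a temp dir.

```shell
#!/bin/sh
# Detect whether a runtime dir holds the systemd user-session sockets
# or only the WSLg leftovers described in this report.
check_runtime_dir() {
    # A healthy session has a "bus" socket and a "systemd" directory.
    if [ -S "$1/bus" ] && [ -d "$1/systemd" ]; then
        echo "ok"
    else
        echo "nuked"
    fi
}

# Simulate the post-login state from this report: only WSLg files remain.
demo=$(mktemp -d)
mkdir "$demo/dbus-1" "$demo/pulse"
touch "$demo/wayland-0.lock"
state=$(check_runtime_dir "$demo")
echo "simulated /run/user/1000 state: $state"
```

Running `check_runtime_dir /run/user/$(id -u)` inside an affected session should print `nuked` until the user session is restarted.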

https://github.com/microsoft/WSL/assets/1235888/d32511fa-967f-4337-8da8-f08ea9468856

Actual Behavior

↪ ~ ➤ systemctl --user status
Failed to connect to bus: No such file or directory
↪ ~ ➤ ls /run/user/1000 -l
total 0
drwx------ 3 gittu gittu 60 Jun 16 00:24 dbus-1
drwx------ 2 gittu gittu 80 Jun 16 00:24 pulse
lrwxrwxrwx 1 root  root  31 Jun 16 00:25 wayland-0 -> /mnt/wslg/runtime-dir/wayland-0
-rw-rw---- 1 gittu gittu  0 Jun 16 00:24 wayland-0.lock

Diagnostic Logs

No response

sarim avatar Jun 15 '23 18:06 sarim

@OneBlue - another one related to your /run/user/ change.

benhillis avatar Jun 15 '23 20:06 benhillis

Thank you for reporting this @sarim.

Interestingly, I can't reproduce the issue. Can you share the output of mount before and after opening WSL with the gittu user ?

WSL does mount an overlayfs on /run/user/X when the session is created, but that happens regardless of whether GUI apps are enabled or not, so I wonder if there's something else happening here.

OneBlue avatar Jun 21 '23 00:06 OneBlue

.wslconfig

[wsl2]
kernelCommandLine=cgroup_no_v1=all
memory=16GB
swap=0
guiApplications=true
debugConsole=false
#vmIdleTimeout=-1

networkingMode=bridged
vmSwitch=WSLBridged
dhcp=false
macAddress=0E:00:00:00:00:00
ipv6=true

/etc/wsl.conf

[user]
default=gittu
[boot]
systemd=true
[network]
hostname = GITTUW11WSL
generateResolvConf=false

/etc/fstab

cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

After wsl --shutdown, I log into WSL as root with wsl -u root -e /bin/bash, then run mount and save the output. Then I open a new tab in the terminal, which logs in as the gittu user, and run mount and save the output again. The two txt files are attached. Here's the diff.

diff mount-before.txt mount-after.txt

41a42,45
> none on /mnt/wslg/run/user/1000/rw type tmpfs (rw,relatime)
> none on /run/user/1000/rw type tmpfs (rw,relatime)
> none on /mnt/wslg/run/user/1000 type overlay (rw,relatime,lowerdir=/mnt/wslg/runtime-dir,upperdir=/mnt/wslg/run/user/1000/rw/upper,workdir=/mnt/wslg/run/user/1000/rw/work)
> none on /run/user/1000 type overlay (rw,relatime,lowerdir=/mnt/wslg/runtime-dir,upperdir=/mnt/wslg/run/user/1000/rw/upper,workdir=/mnt/wslg/run/user/1000/rw/work)


Btw, if I run sudo systemctl restart user@1000, the /run/user/1000/ directory gets restored: both the WSLg sockets and the systemd sockets and other files are then present. That's what I've been doing; after starting WSL, I run that command once. Let me know if you need any more info @OneBlue

sarim avatar Jun 21 '23 00:06 sarim

This might also be relevant.

root@GITTUW11WSL:~# ls /run/user/1000/rw
ls: cannot access '/run/user/1000/rw': No such file or directory
root@GITTUW11WSL:~# ls /mnt/wslg/run/user/1000/rw
ls: cannot access '/mnt/wslg/run/user/1000/rw': No such file or directory
root@GITTUW11WSL:~# ls /mnt/wslg/run/user/1000
dbus-1  pulse  wayland-0  wayland-0.lock

Edit:

WSL does mount an overlayfs on /run/user/X when the session is created, but happens regardless of whether GUI apps are enabled or not so I wonder if there's something else happening here.

Umm, I don't understand: when guiApplications=false, the "system"/WSLg distro doesn't get launched. This /mnt/wslg directory is shared with that system distro, right? So the behavior is definitely changing, as far as I understand it.

The outputs below are with guiApplications=false. Notice that the mount output has no /mnt/wslg-related entry. Though I don't understand how the /mnt/wslg directory is created now, as there's no such entry in the mount output.

↪ ~ ➤ sudo tree /mnt/wslg/
/mnt/wslg/
└── run
    └── user
        └── 1000

3 directories, 0 files
↪ ~ ➤ ls /run/user/1000/
bus  dbus-1  gnupg  pipewire-0  pipewire-0.lock  pk-debconf-socket  podman  systemd
↪ ~ ➤ mount
none on /mnt/wsl type tmpfs (rw,relatime)
none on /usr/lib/wsl/drivers type 9p (ro,nosuid,nodev,noatime,dirsync,aname=drivers;fmask=222;dmask=222,mmap,access=client,msize=65536,trans=fd,rfd=7,wfd=7)
/dev/sdb on / type ext4 (rw,relatime,discard,errors=remount-ro,data=ordered)
none on /usr/lib/wsl/lib type overlay (rw,nosuid,nodev,noatime,lowerdir=/gpu_lib_packaged:/gpu_lib_inbox,upperdir=/gpu_lib/rw/upper,workdir=/gpu_lib/rw/work)
rootfs on /init type rootfs (ro,size=8186832k,nr_inodes=2046708)
none on /dev type devtmpfs (rw,nosuid,relatime,size=8186860k,nr_inodes=2046715,mode=755)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
devpts on /dev/pts type devpts (rw,nosuid,noexec,noatime,gid=5,mode=620,ptmxmode=000)
none on /run type tmpfs (rw,nosuid,nodev,mode=755)
none on /run/lock type tmpfs (rw,nosuid,nodev,noexec,noatime)
none on /run/shm type tmpfs (rw,nosuid,nodev,noatime)
none on /dev/shm type tmpfs (rw,nosuid,nodev,noatime)
none on /run/user type tmpfs (rw,nosuid,nodev,noexec,noatime,mode=755)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
drvfs on /mnt/c type 9p (rw,noatime,dirsync,aname=drvfs;path=C:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)
drvfs on /mnt/d type 9p (rw,noatime,dirsync,aname=drvfs;path=D:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)
drvfs on /mnt/e type 9p (rw,noatime,dirsync,aname=drvfs;path=E:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)
drvfs on /mnt/h type 9p (rw,noatime,dirsync,aname=drvfs;path=H:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)
drvfs on /mnt/i type 9p (rw,noatime,dirsync,aname=drvfs;path=I:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)
/dev/sdb on /run/user type ext4 (rw,relatime,discard,errors=remount-ro,data=ordered)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run/qemu type tmpfs (rw,nosuid,nodev,relatime,mode=755)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=1638024k,nr_inodes=409506,mode=700,uid=1000,gid=1000)

sarim avatar Jun 21 '23 00:06 sarim

For those that need guiApplications=true, one of these might fix things for you until someone gets this fixed in WSL:

sudo systemctl restart user@1000

or

mkdir ~/run_user_1000
mv /run/user/1000/* ~/run_user_1000/
sudo umount /run/user/1000
mv ~/run_user_1000/* /run/user/1000/
rm -rf ~/run_user_1000/

higordearaujo-bsy avatar Jun 21 '23 20:06 higordearaujo-bsy

I just wrote this and added it to my bashrc to restart the user session. The command is allowed in the sudoers file.

function check_and_restart_session {
    # Check if "/run/user/1000/bus" exists
    if [ -e "/run/user/1000/bus" ]; then
        return 0
    fi

    # Try to avoid race condition
    sleep 0.$(( ( RANDOM % 300 ) + 50 ))

    # Check if "/tmp/gittuRestartSession" exists
    if [ -e "/tmp/gittuRestartSession" ]; then
        echo "/tmp/gittuRestartSession exists"
        return 0
    fi

    # If neither condition is true, restart the session
    touch /tmp/gittuRestartSession
    sudo /usr/bin/systemctl restart user@1000.service
    echo "Restart User Session"
}

sarim avatar Jul 04 '23 18:07 sarim

This just started happening to me today: I actually have a working system for a bit, but then it breaks. In journalctl -b0 I can see:

Sep 19 12:14:55 JABAILE-DESK02 systemd[1]: dmesg.service: Deactivated successfully.
Sep 19 12:14:55 JABAILE-DESK02 sudo[736]:  jabaile : TTY=pts/0 ; PWD=/home/jabaile ; USER=root ; COMMAND=/usr/bin/ls -lh /run/user
Sep 19 12:14:55 JABAILE-DESK02 sudo[736]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
Sep 19 12:14:55 JABAILE-DESK02 sudo[736]: pam_unix(sudo:session): session closed for user root
Sep 19 12:15:05 JABAILE-DESK02 sudo[747]:  jabaile : TTY=pts/0 ; PWD=/home/jabaile ; USER=root ; COMMAND=/usr/bin/ls -lh /run/user
Sep 19 12:15:05 JABAILE-DESK02 sudo[747]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
Sep 19 12:15:05 JABAILE-DESK02 sudo[747]: pam_unix(sudo:session): session closed for user root
Sep 19 12:15:09 JABAILE-DESK02 systemd[1]: systemd-timedated.service: Deactivated successfully.
Sep 19 12:15:11 JABAILE-DESK02 sudo[773]:  jabaile : TTY=pts/0 ; PWD=/home/jabaile ; USER=root ; COMMAND=/usr/bin/ls -lh /run/user
Sep 19 12:15:11 JABAILE-DESK02 sudo[773]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
Sep 19 12:15:11 JABAILE-DESK02 sudo[773]: pam_unix(sudo:session): session closed for user root
Sep 19 12:15:22 JABAILE-DESK02 kernel: hv_balloon: Max. dynamic memory size: 32652 MB
Sep 19 12:16:37 JABAILE-DESK02 systemd-networkd-wait-online[187]: Timeout occurred while waiting for network connectivity.
Sep 19 12:16:37 JABAILE-DESK02 systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Sep 19 12:16:37 JABAILE-DESK02 systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Sep 19 12:16:37 JABAILE-DESK02 systemd[1]: Failed to start Wait for Network to be Configured.

I'm running sudo ls -lh /run/user, and the first invocations show 1000. But then the system finishes starting up, and everything that uses this dir breaks.

Running sudo systemctl restart user@1000 works, though this means I have to wait for the failure to occur, run that command, and then make sure I restart all of my terminals, as I have various programs like fnm which use that dir.

I thought this might be due to 2.0.0.0, but I don't have it yet:

WSL version: 1.2.2.0
Kernel version: 5.15.90.1

Perhaps this is a different issue than this thread; apologies if it is, but this is perfectly reproducible for me so I'm totally happy to try things out.

jakebailey avatar Sep 19 '23 19:09 jakebailey

Not sure if it is the same issue I'm seeing. Right after startup finishes, WSL seems to stop the User Manager. This is also visible in syslog:

Sep 21 07:32:51 w00wmi systemd[1]: Startup finished in 2min 1.606s.
Sep 21 07:33:00 w00wmi systemd[1]: Stopping User Manager for UID 1000...
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped target Main User Target.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopping D-Bus User Message Bus...
Sep 21 07:33:00 w00wmi systemd[1828]: Stopping PipeWire Media Session Manager...
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped D-Bus User Message Bus.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped PipeWire Media Session Manager.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopping PipeWire Multimedia Service...
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped PipeWire Multimedia Service.
Sep 21 07:33:00 w00wmi systemd[1828]: Removed slice User Core Session Slice.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped target Basic System.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped target Paths.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped target Sockets.
Sep 21 07:33:00 w00wmi systemd[1828]: Stopped target Timers.
Sep 21 07:33:00 w00wmi systemd[1828]: Closed D-Bus User Message Bus Socket.
Sep 21 07:33:00 w00wmi systemd[1828]: Closed GnuPG network certificate management daemon.
Sep 21 07:33:00 w00wmi systemd[1828]: Closed GnuPG cryptographic agent and passphrase cache (access for web browsers).
Sep 21 07:33:00 w00wmi systemd[1828]: Closed GnuPG cryptographic agent and passphrase cache (restricted).
Sep 21 07:33:00 w00wmi systemd[1828]: Closed GnuPG cryptographic agent (ssh-agent emulation).
Sep 21 07:33:00 w00wmi systemd[1828]: Closed GnuPG cryptographic agent and passphrase cache.
Sep 21 07:33:00 w00wmi systemd[1828]: Closed PipeWire Multimedia System Socket.
Sep 21 07:33:00 w00wmi systemd[1828]: Closed debconf communication socket.
Sep 21 07:33:00 w00wmi systemd[1828]: Closed REST API socket for snapd user session agent.
Sep 21 07:33:00 w00wmi systemd[1828]: Removed slice User Application Slice.
Sep 21 07:33:00 w00wmi systemd[1828]: Reached target Shutdown.
Sep 21 07:33:00 w00wmi systemd[1828]: Finished Exit the Session.
Sep 21 07:33:00 w00wmi systemd[1828]: Reached target Exit the Session.
Sep 21 07:33:00 w00wmi systemd[1]: user@1000.service: Deactivated successfully.
Sep 21 07:33:00 w00wmi systemd[1]: Stopped User Manager for UID 1000.
Sep 21 07:33:00 w00wmi systemd[1]: Stopping User Runtime Directory /run/user/1000...
Sep 21 07:33:00 w00wmi systemd[1]: run-user-1000.mount: Deactivated successfully.
Sep 21 07:33:00 w00wmi systemd[1]: user-runtime-dir@1000.service: Deactivated successfully.
Sep 21 07:33:00 w00wmi systemd[1]: Stopped User Runtime Directory /run/user/1000.
Sep 21 07:33:00 w00wmi systemd[1]: Removed slice User Slice of UID 1000.

After this, when running sudo systemctl restart user@1000 once it is never stopped again until I restart WSL. I am not sure when exactly it started in my case, but it worked before.

WSL version: 1.2.5.0
Kernel version: 5.15.90.1
WSLg version: 1.0.51
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.3448

thwint avatar Sep 21 '23 06:09 thwint

Just found another interesting article: https://serverfault.com/questions/1139283/systemd-stops-user-manager-and-kills-all-user-processes

In my case I am running Ubuntu 22.04. After enabling lingering for my user it does not seem to be stopped anymore.

So my workaround: loginctl enable-linger 1000
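The linger workaround above can be made cheap to check on every shell startup: logind records lingering as an empty file under /var/lib/systemd/linger/, so a profile snippet can test for that file before ever calling loginctl. A hedged sketch (LINGER_DIR is parameterized here only for illustration):

```shell
#!/bin/sh
# Check whether lingering is already enabled before touching loginctl.
LINGER_DIR="${LINGER_DIR:-/var/lib/systemd/linger}"
user="$(id -un)"
if [ -e "$LINGER_DIR/$user" ]; then
    linger_state="enabled"
else
    linger_state="disabled"
    # loginctl enable-linger "$user"   # the actual workaround; needs logind
fi
echo "linger for $user: $linger_state"
```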

thwint avatar Sep 21 '23 07:09 thwint

sudo systemctl restart user@1000

This solved a Failed to connect to bus: No such file or directory issue for me. Using guiApplications=false here. Thanks @higordearaujo-bsy! 👍

jonathan-f-silva avatar Nov 19 '23 17:11 jonathan-f-silva

Workaround for WSL2 Ubuntu 22.04 Systemd Issue

For anyone running WSL2 Ubuntu 22.04 using systemd and encountering this issue, I found a very simple workaround.

  1. sudo loginctl enable-linger $USER.
  2. Disable systemd in /etc/wsl.conf.
  3. Restart WSL using wsl --shutdown.
  4. Re-enable systemd in /etc/wsl.conf.
  5. Restart WSL again.

Note that I had enabled systemd while running 20.04 and it was working fine. This issue only surfaced when I did an in-place upgrade to 22.04. I'll note that before I figured this out, running sudo systemctl restart user@1000 did fix it, but I had to run it every time I restarted the WSL VM. However, enabling linger alone was having no effect.

I guess "Have you tried turning it off and on again?" never stops being sage advice.

bgupta avatar Jun 30 '24 14:06 bgupta

Just found another interesting article: https://serverfault.com/questions/1139283/systemd-stops-user-manager-and-kills-all-user-processes

In my case I am running Ubuntu 22.04. After enabling lingering for my user it does not seem to be stopped anymore.

So my workaround: loginctl enable-linger 1000

For me this started occurring after disabling GUI application support via the WSL Settings app. Enabling linger for the user seems to have resolved the issue.

Edit: I also disabled Hyper-V Firewall at the same time, but based on the comments in this thread, I figure that's unrelated.

polyzen avatar Jul 20 '24 22:07 polyzen

Also see https://github.com/microsoft/WSL/issues/8879 for more systemd issues

Stanzilla avatar Jul 22 '24 19:07 Stanzilla

Hi, my response is not related to your problem, but I got a problem running one of the scripts because I'm a total noob.

How can I get root login access in Windows?

ELISSAWII avatar Jul 27 '24 09:07 ELISSAWII

I've just run across this myself in a newly installed Ubuntu 22.04 setup. It's a pretty big out-of-the-box flaw for anyone who actually wants to use user systemd. Is anyone working on it on the WSL team?

maxb avatar Aug 15 '24 09:08 maxb

Workaround for WSL2 Ubuntu 22.04 Systemd Issue

For anyone running WSL2 Ubuntu 22.04 using systemd and encountering this issue, I found a very simple workaround.

1. `sudo loginctl enable-linger $USER`.

2. Disable systemd in `/etc/wsl.conf`.

3. Restart WSL using `wsl --shutdown`.

4. Re-enable systemd in `/etc/wsl.conf`.

5. Restart WSL again.

Note that I had enabled systemd while running 20.04 and it was working fine. This issue only surfaced when I did an in-place upgrade to 22.04. I'll note that before I figured this out, running sudo systemctl restart user@1000 did fix it, but I had to run it every time I restarted the WSL VM. However, enabling linger alone was having no effect.

I guess "Have you tried turning it off and on again?" never stops being sage advice.

sudo loginctl enable-linger $USER works for me, and it keeps working after restarting WSL.

TTcheng avatar Sep 01 '24 07:09 TTcheng

I fixed it by setting guiApplications = false in .wslconfig

ProfessorStrawberry avatar Jun 01 '25 13:06 ProfessorStrawberry

Commenting this line out should work: https://github.com/microsoft/WSL/blob/9cd3438a2f56db12d96a093d12064ca1634f9134/src/linux/init/config.cpp#L443


Tested on my NixOS-WSL setup, which needs agenix to mount secrets into /run/user/1000.


If you are too lazy to build it yourself: link

kurikomoe avatar Jun 01 '25 19:06 kurikomoe

Nice find, @kurikomoe. Looking at the source of UtilMount, it calls mount without checking whether the directory is empty, and so all the existing content becomes inaccessible.

I wonder why the tmpfs is even created and mounted here. Maybe because, when systemd is disabled, there is nothing there yet? So instead of outright removing the line, the code should check whether a tmpfs already exists and, if it does, skip the mounting. Or check whether systemd is enabled and, if it is, trust that the session is managed by systemd.
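The guard proposed above amounts to "mount only if nobody populated the directory first". A hedged sketch of that check (the demo dirs are hypothetical stand-ins for /run/user/1000, not WSL's actual logic):

```shell
#!/bin/sh
# Skip the WSLg mount when the runtime dir is already populated,
# i.e. systemd's user-runtime-dir service got there first.
should_mount() {
    # Mount only if the directory is missing or empty.
    [ ! -d "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

empty=$(mktemp -d)
populated=$(mktemp -d)
touch "$populated/bus"

should_mount "$empty"     && r1=mount || r1=skip
should_mount "$populated" && r2=mount || r2=skip
echo "empty dir: $r1, populated dir: $r2"
```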

trallnag avatar Jun 01 '25 19:06 trallnag

After my work computer updated to v2.5.7, I've hit this problem, and here's what I understand.

The Problem

There seems to be a poor interaction between three settings/actions:

  1. Having guiApplications = true in %USERPROFILE%\.wslconfig (Windows side)
  2. Having systemd = true in /etc/wsl.conf and setting loginctl enable-linger $USER (Linux side).
  3. How wsl.exe launches a terminal.

The observable problem is that step 3 performs extra actions, like creating a new /run/user/$UID mount point and then mounting individual files to support WSLg (bind-mounting resources that are passed in through /mnt/wslg/runtime-dir).

The problem is that the user-runtime-dir@$UID.service systemd service has already run (because we enabled "lingering") before the WSL login occurs — it happens essentially during boot — and therefore key resources (like the systemd user bus at /run/user/$UID/bus) are shadowed by the second mount.

This can be seen in the double-mounting of /run/user/1000 in the output of findmnt:

├─/run                                              none                    tmpfs       rw,nosuid,nodev,mode=755
│ ├─/run/lock                                       none                    tmpfs       rw,nosuid,nodev,noexec,noatime
│ ├─/run/shm                                        none                    tmpfs       rw,nosuid,nodev,noatime
│ ├─/run/user                                       none                    tmpfs       rw,nosuid,nodev,noexec,noatime,mode=755
│ │ └─/run/user                                     none[/run/user]         tmpfs       rw,relatime
│ │   └─/run/user/1000                              tmpfs                   tmpfs       rw,nosuid,nodev,relatime,size=3266472k,nr_inodes=816618,mode=700,uid=1000,gid=1000
│ │     └─/run/user/1000                            tmpfs                   tmpfs       rw,nosuid,nodev,noexec,relatime,mode=755
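This stacking can also be detected programmatically by parsing /proc/self/mountinfo, whose fifth field is the mount point; seeing /run/user/1000 more than once means a later mount is shadowing the systemd-managed tmpfs. A hedged sketch:

```shell
#!/bin/sh
# Count the mounts stacked on the user runtime dir (uid 1000 assumed,
# as in the findmnt output above).
dir="/run/user/1000"
count=$(awk -v d="$dir" '$5 == d { n++ } END { print n + 0 }' /proc/self/mountinfo)
echo "mounts stacked on $dir: $count"
if [ "$count" -gt 1 ]; then
    echo "shadowed: the systemd user session sockets are hidden"
fi
```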

It's possible to view the contents of the shadowed mount, without unmounting it for the whole system, by using mount namespaces:

sudo unshare --mount
umount /run/user/$UID
ls -lAh /run/user/1000

which lets you confirm the presence of the desired /run/user/1000/bus user bus socket.

Workaround

Right now a workaround seems to be to take over setting up the WSLg environment variables and symlinks and to ensure that your user is not directly logged into by wsl.exe. If instead you configure Windows Terminal to login as root, then the code linked above instead clobbers root's /run/user/0 directory and leaves our true user's alone.

I did this by overriding the Windows Terminal configuration for my WSL distro profile to use a launch command of

C:\Windows\system32\wsl.exe -d archlinux -u root -- login -f $USER

(where here $USER is just a placeholder — it should be filled in literally).

Caveat: I quite often see

[process exited with code 1 (0x00000001)]
You can now close this terminal with Ctrl+D, or press Enter to restart.

when trying to open a new terminal, and I have to press Enter a non-deterministic number of times to actually get a terminal open. I suspect there's a race condition somewhere due to trying to immediately hand off the terminal that wsl.exe is trying to monitor to a new user, but I haven't dug further to sort that out.

After this change alone, the double-mounting of the user run directory problem is bypassed:

├─/run                                              none                    tmpfs       rw,nosuid,nodev,mode=755
│ ├─/run/lock                                       none                    tmpfs       rw,nosuid,nodev,noexec,noatime
│ ├─/run/shm                                        none                    tmpfs       rw,nosuid,nodev,noatime
│ ├─/run/user                                       none                    tmpfs       rw,nosuid,nodev,noexec,noatime,mode=755
│ │ └─/run/user                                     none[/run/user]         tmpfs       rw,relatime
│ │   ├─/run/user/1000                              tmpfs                   tmpfs       rw,nosuid,nodev,relatime,size=3266472k,nr_inodes=816618,mode=700,uid=1000,gid=1000
│ │   └─/run/user/0                                 tmpfs                   tmpfs       rw,nosuid,nodev,noexec,relatime,mode=755

In order to have the WSLg files appear in your real user's run directory, we then take advantage of the systemd-tmpfiles approach mentioned in the Arch WSL docs and the WSLg issue:

Create /etc/user-tmpfiles.d/wslg.conf with contents:

#Type Path              Mode User Group Age Argument
L+    %t/wayland-0      -    -    -     -   /mnt/wslg/runtime-dir/wayland-0
L+    %t/wayland-0.lock -    -    -     -   /mnt/wslg/runtime-dir/wayland-0.lock
L+    %t/pulse/native   -    -    -     -   /mnt/wslg/runtime-dir/pulse/native
L+    %t/pulse/pid      -    -    -     -   /mnt/wslg/runtime-dir/pulse/pid

and enable the user tmpfiles services:

systemctl enable --user --now systemd-tmpfiles-setup.service
systemctl enable --user systemd-tmpfiles-clean.timer

Create /etc/profile.d/wslg.sh with contents:

export DISPLAY=${DISPLAY:-:0}
export WAYLAND_DISPLAY=${WAYLAND_DISPLAY:-wayland-0}

Suggested Fix

I think the fix should be to:

  1. Only run the logic in config.cpp to mount a user run directory if systemd is not used.
  2. Then, when systemd is being used, add additional generated systemd units in init.cpp that use a systemd service to symlink and/or mount the WSLg files.

System Info

Distro: Archlinux

WSL version info:

> wsl --version
WSL version: 2.5.7.0
Kernel version: 6.6.87.1-1
WSLg version: 1.0.66
MSRDC version: 1.2.6074
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.22631.5335

jmert avatar Jun 02 '25 14:06 jmert

Workaround

Right now a workaround seems to be to take over setting up the WSLg environment variables and symlinks and to ensure that your user is not directly logged into by wsl.exe. If instead you configure Windows Terminal to login as root, then the code linked above instead clobbers root's /run/user/0 directory and leaves our true user's alone.

I did this by overriding the Windows Terminal configuration for my WSL distro profile to use a launch command of

C:\Windows\system32\wsl.exe -d archlinux -u root -- login -f $USER

Unfortunately this workaround doesn't last: after some operations (like running a program or starting another terminal), the issue reappears during a running session. (Edit: I'm using Fedora 42, though.)

pfeileon avatar Jun 10 '25 11:06 pfeileon

My workaround is:

Commenting out the default user in: /etc/wsl.conf

# [user]
# default=xxxxxxx

logout

wsl --shutdown

wsl -d yourdistro

Now it will log in as root, so to change to your user, use:

su - xxxxxxx

Now, when we check systemctl --user, it no longer fails.

There's no "Failed to connect to bus: No such file or directory" anymore.

I hit this problem after installing Docker, whose installation instructions include sudo loginctl enable-linger $USER.

I don't know exactly where the bug is, but it certainly happens if:

  • A default user is set in wsl.conf
  • You use sudo loginctl enable-linger $USER

So, for my case, disabling the default user is the workaround.

However, it's a hassle to log in as a normal user every time I start the WSL distro.

marhensa avatar Jun 19 '25 03:06 marhensa

I think this issue is fixed in v2.6.0, presumably due to #13101 which effectively implements what I suggested in https://github.com/microsoft/WSL/issues/10205#issuecomment-2931121662:

Suggested Fix

I think the fix should be to:

1. Only run the logic in [`config.cpp`](https://github.com/microsoft/WSL/blob/9cd3438a2f56db12d96a093d12064ca1634f9134/src/linux/init/config.cpp#L436-L436) to mount a user run directory if systemd is not used.

2. Then when systemd is being used, add additional systemd generated units in [`init.cpp`](https://github.com/microsoft/WSL/blob/9cd3438a2f56db12d96a093d12064ca1634f9134/src/linux/init/init.cpp#L239) which uses a systemd service to symlinks and/or mount the WSLg files.

jmert avatar Jun 20 '25 15:06 jmert

This is indeed fixed in 2.6.0! Closing.

OneBlue avatar Jun 20 '25 19:06 OneBlue

anyone else getting wsl: Failed to start the systemd user session since that update?

wsl: Failed to start the systemd user session for 'stan'. See journalctl for more details.

~
❯ journalctl
Jun 02 22:39:31 STAN-PC kernel: Linux version 6.6.87.1-microsoft-standard-WSL2 (root@af282157c79e)>
Jun 02 22:39:31 STAN-PC kernel: Command line: initrd=\initrd.img WSL_ROOT_INIT=1 panic=-1 nr_cpus=>
Jun 02 22:39:31 STAN-PC kernel: KERNEL supported cpus:
Jun 02 22:39:31 STAN-PC kernel:   Intel GenuineIntel
Jun 02 22:39:31 STAN-PC kernel:   AMD AuthenticAMD
Jun 02 22:39:31 STAN-PC kernel: BIOS-provided physical RAM map:
Jun 02 22:39:31 STAN-PC kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
Jun 02 22:39:31 STAN-PC kernel: BIOS-e820: [mem 0x00000000000e0000-0x00000000000e0fff] reserved
Jun 02 22:39:31 STAN-PC kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000001fffff] ACPI data
Jun 02 22:39:31 STAN-PC kernel: BIOS-e820: [mem 0x0000000000200000-0x00000000f7ffffff] usable
Jun 02 22:39:31 STAN-PC kernel: BIOS-e820: [mem 0x0000000100000000-0x00000008059fffff] usable
Jun 02 22:39:31 STAN-PC kernel: NX (Execute Disable) protection: active
Jun 02 22:39:31 STAN-PC kernel: APIC: Static calls initialized
Jun 02 22:39:31 STAN-PC kernel: DMI not present or invalid.
Jun 02 22:39:31 STAN-PC kernel: Hypervisor detected: Microsoft Hyper-V
Jun 02 22:39:31 STAN-PC kernel: Hyper-V: privilege flags low 0xae7f, high 0x3b8030, hints 0x900c2c>
Jun 02 22:39:31 STAN-PC kernel: Hyper-V: Nested features: 0x4a0000
Jun 02 22:39:31 STAN-PC kernel: Hyper-V: LAPIC Timer Frequency: 0xc3500
Jun 02 22:39:31 STAN-PC kernel: Hyper-V: Using hypercall for remote TLB flush
Jun 02 22:39:31 STAN-PC kernel: clocksource: hyperv_clocksource_tsc_page: mask: 0xffffffffffffffff>
Jun 02 22:39:31 STAN-PC kernel: clocksource: hyperv_clocksource_msr: mask: 0xffffffffffffffff max_>
Jun 02 22:39:31 STAN-PC kernel: tsc: Detected 3000.158 MHz processor
Jun 02 22:39:31 STAN-PC kernel: e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
Jun 02 22:39:31 STAN-PC kernel: e820: remove [mem 0x000a0000-0x000fffff] usable
lines 1-24

Stanzilla avatar Jun 24 '25 20:06 Stanzilla

@Stanzilla: Could you create a new issue with logs so we can look into it ?

OneBlue avatar Jun 24 '25 20:06 OneBlue

@Stanzilla: Could you create a new issue with logs so we can look into it ?

Will do when it happens again, yes!

Stanzilla avatar Jun 24 '25 20:06 Stanzilla

sudo systemctl restart user@1000

This solved to me a Failed to connect to bus: No such file or directory issue. Using guiApplications=false here. Thanks @higordearaujo-bsy! 👍

In nushell you can use:

sudo systemctl restart user@$"(id -u)"

For bash:

sudo systemctl restart user@$(id -u)

peteut avatar Aug 19 '25 14:08 peteut

Thank you @peteut !

For bash:

sudo systemctl restart user@$(id -u)

Finally I can use Docker on Windows again; I had been dual-booting into Linux because of this weird bug.

marhensa avatar Aug 19 '25 15:08 marhensa

The problem is that the user-runtime-dir@$UID.service systemd service has already run (because we enabled "lingering") before the WSL login occurs — it happens essentially during boot — and therefore key resources (like the systemd user bus at /run/user/$UID/bus) are shadowed by the second mount.

This can be seen in the double-mounting of /run/user/1000 in the output of findmnt:

├─/run                                              none                    tmpfs       rw,nosuid,nodev,mode=755
│ ├─/run/lock                                       none                    tmpfs       rw,nosuid,nodev,noexec,noatime
│ ├─/run/shm                                        none                    tmpfs       rw,nosuid,nodev,noatime
│ ├─/run/user                                       none                    tmpfs       rw,nosuid,nodev,noexec,noatime,mode=755
│ │ └─/run/user                                     none[/run/user]         tmpfs       rw,relatime
│ │   └─/run/user/1000                              tmpfs                   tmpfs       rw,nosuid,nodev,relatime,size=3266472k,nr_inodes=816618,mode=700,uid=1000,gid=1000
│ │     └─/run/user/1000                            tmpfs                   tmpfs       rw,nosuid,nodev,noexec,relatime,mode=755

With Ubuntu 22.04, and while waiting for WSL 2.6 to become stable, I believe I fixed the issue thanks to the valuable information provided by @jmert. When running findmnt, there was also this double-mounting of /run/user/1000.

I then remembered that I had previously enabled lingering (without any result, by the way):

sudo loginctl enable-linger $USER

Well, I simply did the opposite operation:

sudo loginctl disable-linger $USER

Since then, the double-mounting of /run/user/1000 has disappeared, even though I have rebooted the system almost 10 times. And /run/user/1000 stays clean.

I don't know if it's related, but I previously fixed an annoying issue with networkd by restricting /lib/systemd/system/systemd-networkd-wait-online.service to eth0:

ExecStart=/lib/systemd/systemd-networkd-wait-online -i eth0
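Rather than editing the unit under /lib (which a package upgrade can overwrite), the same restriction can live in a drop-in override. A hedged sketch; DEST defaults to a temp dir here purely for illustration, whereas on a real system it would be /etc/systemd/system/systemd-networkd-wait-online.service.d:

```shell
#!/bin/sh
# Write the eth0 restriction as a systemd drop-in file. The empty
# ExecStart= line clears the unit's original command before replacing it.
DEST="${DEST:-$(mktemp -d)}"
mkdir -p "$DEST"
cat > "$DEST/10-eth0.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/lib/systemd/systemd-networkd-wait-online -i eth0
EOF
echo "wrote $DEST/10-eth0.conf"
```

On a real system, follow this with `sudo systemctl daemon-reload` so systemd picks up the drop-in.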

I also solved some minor problems with pulseaudio and pipewire, but I guess they weren't related to the original problem.

alexisbg avatar Aug 21 '25 16:08 alexisbg