
[BUG] salt-master - Incorrect ownership of cache files causes gitfs error

Open jlrcontegix opened this issue 7 months ago • 2 comments

Description We run salt-master as root. After a seemingly random period of time, we begin to see 'dubious ownership' errors in /var/log/salt/master:

2025-06-12 19:38:36,229 [salt.utils.gitfs :2897][ERROR   ][3681789] Exception caught while fetching gitfs remote 'ssh://[email protected]/athing/scripts.git': Cmd('git') failed due to: exit code(128)
  cmdline: git fetch -v -- origin
  stderr: 'fatal: detected dubious ownership in repository at '/var/cache/salt/master/gitfs/CoHLW+aHiI6cj11mg8+pCMjwYFJBDJlLitI+CFYGgEE=/_''
Traceback (most recent call last):
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py", line 2888, in fetch_remotes
    if repo.fetch():
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py", line 943, in fetch
    return self._fetch()
  File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/utils/gitfs.py", line 1591, in _fetch
    fetch_results = origin.fetch()
  File "/opt/saltstack/salt/extras-3.10/git/remote.py", line 1076, in fetch
    res = self._get_fetch_info_from_stderr(proc, progress, kill_after_timeout=kill_after_timeout)
  File "/opt/saltstack/salt/extras-3.10/git/remote.py", line 902, in _get_fetch_info_from_stderr
    proc.wait(stderr=stderr_text)
  File "/opt/saltstack/salt/extras-3.10/git/cmd.py", line 834, in wait
    raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git fetch -v -- origin
  stderr: 'fatal: detected dubious ownership in repository at '/var/cache/salt/master/gitfs/CoHLW+aHiI6cj11mg8+pCMjwYFJBDJlLitI+CFYGgEE=/_''

Looking at the gitfs cache directory named in the log, we see that it is owned by the salt user instead of root:

[root@salt]# ls -laF /var/cache/salt/master/gitfs/CoHLW+aHiI6cj11mg8+pCMjwYFJBDJlLitI+CFYGgEE=/_
total 0
drwxr-xr-x 3 salt salt  18 Jun 12 19:38 ./
drwxr-xr-x 3 salt salt  15 Jun 12 19:38 ../
drwxr-xr-x 8 salt salt 149 Jun 12 19:39 .git/

Changing the ownership to root:root resolves the issue, but why are the files being created as the salt user when we run as root? This seems to be a 'known' issue, but it has recurred on the same salt-master for us several times over the last few months: https://knowledge.broadcom.com/external/article/378183/salt-master-fails-to-start-with-error-ex.html
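For anyone hitting the same error, the manual ownership change described above could be scripted roughly as follows. This is only a sketch of the workaround, not a fix for whatever creates the files as the wrong user; the function name `reown_tree` is hypothetical, and changing ownership to another user requires running it as root.

```python
import os


def reown_tree(root, uid, gid):
    """Recursively set the owner and group of `root` and everything beneath it.

    Hypothetical helper illustrating the manual workaround from the report;
    it does not address the underlying cause of the ownership change.
    """
    # follow_symlinks=False changes a symlink itself rather than its target.
    os.chown(root, uid, gid, follow_symlinks=False)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.chown(os.path.join(dirpath, name), uid, gid, follow_symlinks=False)


# Example (as root), matching the root:root change described above:
# reown_tree("/var/cache/salt/master/gitfs", 0, 0)
```

Stopping salt-master before running something like this is probably sensible, so that the cache is not modified while ownership is being changed.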

Other directories within the cache are owned by the wrong user as well:

[root@salt]# pwd
/var/cache/salt/master

[root@salt]# ls -l
total 16
drwxr-xr-x  4 salt salt   32 Jun 12 17:15 file_lists
drwxr-xr-x 32 salt salt 4096 Jun 12 19:39 gitfs
drwxr-x--- 48 root root 4096 Jun 12 20:16 jobs
drwxr-xr-x 42 salt salt 4096 Jun 12 20:16 minions
drwxr-x---  2 root root    6 Jun 12 13:05 proc
drwxr-x---  2 root root    6 Jun 12 13:05 queues
drwxr-x---  3 salt salt   35 Jun 12 13:05 roots
drwxr-xr-x  2 salt salt   23 Jun 12 19:38 saltapi
drwxr-xr-x  2 salt salt 4096 Jun 12 20:16 sessions
drwxr-x---  2 root root    6 Jun 12 13:05 syndics
drwxr-x---  2 root root    6 Jun 12 13:05 tokens

As a test, we removed the entire cache. After some period of time, the ownership mismatch within /var/cache/salt/master reappeared.
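To catch the mismatch before gitfs starts failing, a periodic check along these lines might help. It is a minimal sketch that assumes the expected owner is the user the master is configured to run as (root, uid 0, in this report); the function name `find_mismatched_owners` is hypothetical.

```python
import os
from pathlib import Path


def find_mismatched_owners(root, expected_uid):
    """Return every path under `root` (inclusive) not owned by `expected_uid`."""
    root = Path(root)
    mismatched = []
    # lstat() reports a symlink's own ownership rather than its target's.
    if root.lstat().st_uid != expected_uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.lstat().st_uid != expected_uid:
                mismatched.append(path)
    return mismatched


# Example, assuming the master runs as root (uid 0) as in this report:
# for path in find_mismatched_owners("/var/cache/salt/master", 0):
#     print(path)
```

Logging the modification times of the reported paths alongside the master log could help narrow down which process creates them as the salt user.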

Setup Running salt-master on Rocky 8.10 and setting "user: root" in the master config.

Steps to Reproduce the behavior We aren't sure exactly when this began, but we have seen it occur on 3006.12, 3007.3, and 3007.4.

Expected behavior Cache files should be created as the user running salt-master.


Versions Report

salt --versions-report
Salt Version:
          Salt: 3006.12

Python Version:
        Python: 3.10.17 (main, Jun  9 2025, 20:41:48) [GCC 11.2.0]

Dependency Versions:
          cffi: 1.17.1
      cherrypy: unknown
  cryptography: 42.0.5
      dateutil: 2.8.1
     docker-py: Not Installed
         gitdb: 4.0.11
     gitpython: 3.1.44
        Jinja2: 3.1.6
       libgit2: Not Installed
  looseversion: 1.0.2
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 22.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: Not Installed
  python-gnupg: 0.4.8
        PyYAML: 6.0.1
         PyZMQ: 23.2.0
        relenv: 0.19.3
         smmap: 5.0.1
       timelib: 0.2.4
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: rocky 8.10 Green Obsidian
        locale: utf-8
       machine: x86_64
       release: 4.18.0-553.54.1.el8_10.x86_64
        system: Linux
       version: Rocky Linux 8.10 Green Obsidian


jlrcontegix, Jun 13 '25 00:06

I encountered this a while ago (back on 3007.1?). Worked around it by ensuring that salt runs as the salt user.

At the time I was only trying to run salt as root to work around another bug with the Debian packaging... ~~I will raise an issue for that.~~ PR submitted: https://github.com/saltstack/salt/pull/68073

rmounce, Jun 13 '25 05:06

It could be that the ownership of something has changed during the package upgrade.

dwoz, Jun 16 '25 20:06