salt icon indicating copy to clipboard operation
salt copied to clipboard

[BUG] SaltStack 3006.x `file` Function State Execution Performance Degradation

Open cianyleow opened this issue 2 years ago • 9 comments

Description of the tech debt to be addressed, include links and screenshots

When upgrading to SaltStack 3006.x, a significant performance regression was identified in state execution, resulting in large increases to execution time for certain state functions.

A side-by-side analysis of the same state application between 3005.1 and 3006.3 below highlights the performance regression seen:

State Function Average 3005.1 Timing Average 3006.3 Timing Average % Increase
cmd.run 451.12ms 447.01ms -0.91%
cmd.wait 1.22ms 0.77ms -37.16%
file.directory 1.10ms 1.14ms 2.81%
file.managed 77.56ms 484.13ms 524.20%
file.serialize 5.94ms 248.95ms 4,093.13%
pkg.latest 2.50s 2.55s 1.81%
service.dead 52.55ms 54.87ms 4.41%

As can be seen in the data, the file.managed and file.serialize functions are 5-40x slower (on average) after this change, which was traced back to the re-initialization of the file client every time it was required.

In particular, this change (#63984), which adjusted object lifecycle management for the fileclient, was identified as the initial driver of the performance impact. Coupled with a bug fix (#64113), caching of file client initialisation in the cp and defaults modules as well as the jinja utils was removed, which accounted for the increased state execution timings seen.

For reference, an internal patch was applied to reintroduce file client initialisation caching, and it mitigated the performance impact completely.

Versions Report

Salt: 3006.3 Debian: Bookworm (12)

cianyleow avatar Oct 24 '23 12:10 cianyleow

Can you provide the salt --versions report and some indication of your fileserver config? This is a bug, not technical debt.

OrangeDog avatar Oct 24 '23 12:10 OrangeDog

We're using the default fileserver configuration (we don't set anything as far as I'm aware), with no changes between our setup for 3005.1 and 3006.3.

Here's the salt --versions report for the 3005.1 and 3006.3 Master/Minion pairs used for the testing:

salt@old-master: salt --versions
Salt Version:
          Salt: 3005.1

Dependency Versions:
          cffi: 1.15.1
      cherrypy: unknown
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]
  python-gnupg: 0.4.9
        PyYAML: 6.0
         PyZMQ: 24.0.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 12 bookworm
        locale: utf-8
       machine: x86_64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 12 bookworm
------------------------------

salt@old-minion$ salt-call --versions
Salt Version:
          Salt: 3005.1

Dependency Versions:
          cffi: 1.15.1
      cherrypy: Not Installed
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]
  python-gnupg: Not Installed
        PyYAML: 6.0
         PyZMQ: 24.0.1
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 12 bookworm
        locale: utf-8
       machine: aarch64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 12 bookworm
------------------------------

salt@new-master: salt --versions
Salt Version:
          Salt: 3006.3

Python Version:
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]

Dependency Versions:
          cffi: 1.15.1
      cherrypy: Not Installed
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
  looseversion: 1.0.3
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
  python-gnupg: 0.4.9
        PyYAML: 6.0
         PyZMQ: 24.0.1
        relenv: Not Installed
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 12.0 bookworm
        locale: utf-8
       machine: x86_64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 12.0 bookworm
------------------------------

salt@new-minion$ salt-call --versions
Salt Version:
          Salt: 3006.3

Python Version:
        Python: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0]

Dependency Versions:
          cffi: 1.15.1
      cherrypy: Not Installed
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.2
       libgit2: Not Installed
  looseversion: 1.0.3
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.3
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.11.0
        pygit2: Not Installed
  python-gnupg: Not Installed
        PyYAML: 6.0
         PyZMQ: 24.0.1
        relenv: Not Installed
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: debian 2023.10.3 bookworm
        locale: utf-8
       machine: aarch64
       release: 6.4.16-linuxkit
        system: Linux
       version: Debian GNU/Linux 2023.10.3 bookworm

cianyleow avatar Oct 26 '23 15:10 cianyleow

The same problem with 3006.4

AVKarelin avatar Nov 13 '23 04:11 AVKarelin

Thanks for the fix - the __file_client__ dunder is ingenious. That being said, I was still seeing a bit of a performance regression during template rendering as fileclient wasn't being passed through in the context here.

I've attempted to fix it by explicitly passing through the file client where this is invoked (the file module and pillar/state implementations) and we use it - would that be helpful for me to try and submit as a patch?

cianyleow avatar Jan 30 '24 17:01 cianyleow

Thanks for the fix - the __file_client__ dunder is ingenious. That being said, I was still seeing a bit of a performance regression during template rendering as fileclient wasn't being passed through in the context here.

I've attempted to fix it by explicitly passing through the file client where this is invoked (the file module and pillar/state implementations) and we use it - would that be helpful for me to try and submit as a patch?

If you are able to submit a patch that would be great, otherwise I will take a look at it.

dwoz avatar Feb 06 '24 20:02 dwoz

Thanks for the fix - the __file_client__ dunder is ingenious. That being said, I was still seeing a bit of a performance regression during template rendering as fileclient wasn't being passed through in the context here. I've attempted to fix it by explicitly passing through the file client where this is invoked (the file module and pillar/state implementations) and we use it - would that be helpful for me to try and submit as a patch?

If you are able to submit a patch that would be great, otherwise I will take a look at it.

@cianyleow I'm going to take a look at this for 3006.8

dwoz avatar Feb 09 '24 01:02 dwoz

@dwoz this is where I got to with my patch: https://github.com/cianyleow/salt/commit/7620c4dd6562be3f09d96a85c63bfd822004b1fc

Hopefully this helps!

cianyleow avatar Feb 11 '24 21:02 cianyleow

@cianyleow Are you able to verify this patch resolves the issue? https://github.com/saltstack/salt/pull/66058/commits/317a6dd44be18441a9b9b579d7ff99d497a074f8s

dwoz avatar Feb 13 '24 22:02 dwoz

@cianyleow Are you able to verify this patch resolves the issue? https://github.com/saltstack/salt/pull/66058/commits/317a6dd44be18441a9b9b579d7ff99d497a074f8s

@dwoz I won't be able to get back to this for a couple of weeks, but I think its conceptually similar to how I patched (I went for a more explicit approach for the use cases I have).

cianyleow avatar Feb 14 '24 15:02 cianyleow

Fixed in 3006.8

dwoz avatar May 01 '24 21:05 dwoz