[BUG] Hourly presence change events received on master
Description
We're currently running into the following with v3006.1: Regardless of any actual changes, the master receives presence/change every hour for all minions. This is not the behavior we had in v3002, and presence change events in our environment can trigger loads of orchestrate states to run, and all at once now.
Setup (Please provide relevant configs and/or SLS files (be sure to remove sensitive info. There is no general set-up of Salt.)
Please be as specific as possible and give set-up details.
- [ ] on-prem machine
- [ ] VM (Virtualbox, KVM, etc. please specify)
- [ ] VM running on a cloud service, please be explicit and add details
- [x] container (Kubernetes, Docker, containerd, etc. please specify) LXD
- [ ] or a combination, please be explicit
- [ ] jails if it is FreeBSD
- [ ] classic packaging
- [x] onedir packaging
- [ ] used bootstrap to install
Steps to Reproduce the behavior debug logging:
2023-06-18 04:04:43,670 [salt.utils.event :823 ][DEBUG ][22091] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T02:04:43.670825'}
2023-06-18 05:05:16,638 [salt.utils.event :823 ][DEBUG ][24066] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T03:05:16.638313'}
2023-06-18 06:05:49,553 [salt.utils.event :823 ][DEBUG ][25224] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T04:05:49.553452'}
2023-06-18 07:06:22,526 [salt.utils.event :823 ][DEBUG ][26719] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T05:06:22.526803'}
2023-06-18 08:06:55,540 [salt.utils.event :823 ][DEBUG ][28012] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T06:06:55.540804'}
2023-06-18 09:07:28,490 [salt.utils.event :823 ][DEBUG ][29283] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T07:07:28.490620'}
2023-06-18 10:08:01,368 [salt.utils.event :823 ][DEBUG ][30765] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T08:08:01.368235'}
2023-06-18 11:08:34,233 [salt.utils.event :823 ][DEBUG ][31988] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T09:08:34.233706'}
2023-06-18 12:09:07,234 [salt.utils.event :823 ][DEBUG ][33340] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T10:09:07.234497'}
2023-06-18 13:09:40,190 [salt.utils.event :823 ][DEBUG ][34530] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T11:09:40.190707'}
2023-06-18 14:10:13,058 [salt.utils.event :823 ][DEBUG ][36180] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T12:10:13.058599'}
2023-06-18 15:10:46,088 [salt.utils.event :823 ][DEBUG ][37798] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T13:10:46.088279'}
2023-06-18 16:11:19,078 [salt.utils.event :823 ][DEBUG ][39355] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T14:11:19.078218'}
2023-06-18 17:11:52,016 [salt.utils.event :823 ][DEBUG ][40622] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T15:11:52.016171'}
2023-06-18 18:12:24,950 [salt.utils.event :823 ][DEBUG ][42013] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T16:12:24.950058'}
2023-06-18 19:12:57,931 [salt.utils.event :823 ][DEBUG ][43556] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T17:12:57.931541'}
2023-06-18 20:13:30,817 [salt.utils.event :823 ][DEBUG ][44718] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T18:13:30.816871'}
2023-06-18 21:14:03,799 [salt.utils.event :823 ][DEBUG ][46482] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T19:14:03.799817'}
2023-06-18 22:14:46,898 [salt.utils.event :823 ][DEBUG ][47974] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T20:14:46.898101'}
2023-06-18 23:15:19,799 [salt.utils.event :823 ][DEBUG ][49405] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T21:15:19.799625'}
2023-06-19 00:15:52,709 [salt.utils.event :823 ][DEBUG ][50522] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T22:15:52.709098'}
2023-06-19 01:16:25,736 [salt.utils.event :823 ][DEBUG ][52154] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-18T23:16:25.736522'}
2023-06-19 02:16:58,649 [salt.utils.event :823 ][DEBUG ][53452] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T00:16:58.649243'}
2023-06-19 03:17:31,533 [salt.utils.event :823 ][DEBUG ][54792] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T01:17:31.533062'}
2023-06-19 04:18:04,528 [salt.utils.event :823 ][DEBUG ][56938] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T02:18:04.528774'}
2023-06-19 05:18:37,484 [salt.utils.event :823 ][DEBUG ][58051] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T03:18:37.484359'}
2023-06-19 06:19:10,401 [salt.utils.event :823 ][DEBUG ][59426] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T04:19:10.401127'}
2023-06-19 07:19:43,267 [salt.utils.event :823 ][DEBUG ][60559] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T05:19:43.267281'}
2023-06-19 08:20:16,253 [salt.utils.event :823 ][DEBUG ][61818] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T06:20:16.253843'}
2023-06-19 09:20:49,215 [salt.utils.event :823 ][DEBUG ][14118] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T07:20:49.215430'}
2023-06-19 10:21:32,676 [salt.utils.event :823 ][DEBUG ][22258] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T08:21:32.676621'}
2023-06-19 11:22:05,217 [salt.utils.event :823 ][DEBUG ][25328] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T09:22:05.217491'}
2023-06-19 12:22:38,102 [salt.utils.event :823 ][DEBUG ][31828] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T10:22:38.102539'}
2023-06-19 13:23:11,098 [salt.utils.event :823 ][DEBUG ][37177] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T11:23:11.097996'}
2023-06-19 14:23:44,048 [salt.utils.event :823 ][DEBUG ][38463] Sending event: tag = salt/presence/change; data = {'new': ['d201d0cd-8883-475f-9b57-883336202e46', '8862d39d-77b0-4b7d-9d7c-e47a7bd00ca5', '44da1045-eb8a-4264-b0ab-d4953a8730ca'], 'lost': [], '_stamp': '2023-06-19T12:23:44.048204'}
Relevant master config:
presence_events: True
tcp_keepalive: True
tcp_keepalive_idle: 10
tcp_keepalive_cnt: 3
tcp_keepalive_intvl: 10
Expected behavior
presence/change events are only fired when minions are actually disconnected / gone or initiate a new TCP connection
Screenshots If applicable, add screenshots to help explain your problem.
Versions Report
salt --versions-report
(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)Salt Version:
Salt: 3006.1
Python Version:
Python: 3.10.11 (main, May 5 2023, 02:31:54) [GCC 11.2.0]
Dependency Versions:
cffi: 1.14.6
cherrypy: unknown
dateutil: 2.8.1
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
Jinja2: 3.1.2
libgit2: Not Installed
looseversion: 1.0.2
M2Crypto: Not Installed
Mako: Not Installed
msgpack: 1.0.2
msgpack-pure: Not Installed
mysql-python: Not Installed
packaging: 22.0
pycparser: 2.21
pycrypto: Not Installed
pycryptodome: 3.9.8
pygit2: Not Installed
python-gnupg: 0.4.8
PyYAML: 5.4.1
PyZMQ: 23.2.0
relenv: 0.12.3
smmap: Not Installed
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.4
System Versions:
dist: centos 7.6.1810 Core
locale: utf-8
machine: x86_64
release: 5.15.0-56-generic
system: Linux
version: CentOS Linux 7.6.1810 Core
Additional context Add any other context about the problem here.
Possibly caused by #63941
Possibly caused by https://github.com/saltstack/salt/pull/63941
Agreed! Possible fix at https://github.com/saltstack/salt/pull/64508
We have the same issue here: Salt Minion Versions: Salt Version: Salt: 3006.3
Python Version: Python: 3.10.13 (main, Sep 6 2023, 02:11:27) [GCC 11.2.0]
Dependency Versions: cffi: 1.14.6 cherrypy: 18.6.1 dateutil: 2.8.1 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 3.1.2 libgit2: Not Installed looseversion: 1.0.2 M2Crypto: Not Installed Mako: Not Installed msgpack: 1.0.2 msgpack-pure: Not Installed mysql-python: Not Installed packaging: 22.0 pycparser: 2.21 pycrypto: Not Installed pycryptodome: 3.9.8 pygit2: Not Installed python-gnupg: 0.4.8 PyYAML: 6.0.1 PyZMQ: 23.2.0 relenv: 0.13.10 smmap: Not Installed timelib: 0.2.4 Tornado: 4.5.3 ZMQ: 4.3.4
System Versions: dist: ubuntu 22.04.4 jammy locale: utf-8 machine: x86_64 release: 5.15.0-97-generic system: Linux version: Ubuntu 22.04.4 jammy
Salt Master Version:
Salt Version: Salt: 3006.7
Python Version: Python: 3.10.13 (main, Feb 19 2024, 03:31:20) [GCC 11.2.0]
Dependency Versions: cffi: 1.14.6 cherrypy: unknown dateutil: 2.8.1 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 3.1.3 libgit2: Not Installed looseversion: 1.0.2 M2Crypto: Not Installed Mako: Not Installed msgpack: 1.0.2 msgpack-pure: Not Installed mysql-python: Not Installed packaging: 22.0 pycparser: 2.21 pycrypto: Not Installed pycryptodome: 3.19.1 pygit2: Not Installed python-gnupg: 0.4.8 PyYAML: 6.0.1 PyZMQ: 23.2.0 relenv: 0.15.1 smmap: Not Installed timelib: 0.2.4 Tornado: 4.5.3 ZMQ: 4.3.4
System Versions: dist: ubuntu 22.04.4 jammy locale: utf-8 machine: x86_64 release: 5.15.0-97-generic system: Linux version: Ubuntu 22.04.4 jammy
About every hour all minions marked as "new" within event "/presence/changed" although there was no "lost" event before.
We are using a python script as systemd service to alert us in case of presence change:
import logging import salt.config import salt.utils.event from create_opsgenie_alert import create_opsgenie_alerts
logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%d.%m.%Y %H:%M:%S ', level=logging.INFO)
opts = salt.config.client_config("/etc/salt/master/")
event = salt.utils.event.get_event("master", sock_dir=opts["sock_dir"], opts=opts)
for data in event.iter_events(tag="salt/presence/change"): logging.info("TAG = salt/presence/change") logging.info("Data:\n" + data + "\n") logging.info(data.get('new', '--')) logging.info(data.get('lost', '--')) # print(data.get('_stamp')) alertmsg = "Salt Minion Presence State changed..." message = "Minion presence changed:\n new: {},\n lost: {}".format(data.get('new'), data.get('lost')) logging.info(message) # fire opsgenie alert create_opsgenie_alerts(alertmsg,'Message:\n {}'.format(message), "Salt Minion State Change Alert","P3")
Apologies, I need to update the tests for the fix in https://github.com/saltstack/salt/pull/64508 still then it should be ready to merge. My evenings will be busy for a while so I can't say when I will have time to do this
Any news here? We handling every day 24 alerts because of this issue because we are checking the salt minion status with that event. It would be nice to have a workaround until we have a fix, if there is one.
Sorry, I have an infant taking up all my spare time :) There is only one test case needed to be added if I recall. I can write down what it is if you want to take a stab at it. It's possible it will require a logic change but I don't recall honestly, it's been to long
You can always patch your master with the change from https://github.com/saltstack/salt/pull/64508, that is what I have done locally FWIW