[BUG] Master shows duplicate minion IDs after being offline for a couple of hours
Description
This is a weird one. The master was down for a couple of hours; after starting it again and running test.ping, the master shows each minion ID multiple times. The minions are connected in a multimaster configuration. The duplicates persist until the minions are restarted to force a reconnection to the master. Another odd symptom: I see multiple start events from the minions (https://github.com/saltstack/salt/issues/66341).
Everything is hosted in Azure (maybe the virtual network layer has something to do with it?), running 3006.x versions. I waited about 10 minutes to see if it would self-heal, but the issue continued until I restarted the minions.
[root@vesselsim ~]# salt \*-ems\* test.ping
vesselsim-win-ems-2:
True
vesselsim-win-ems-2:
True
vesselsim-win-ems-1:
True
vesselsim-win-ems-1:
True
[root@vesselsim ~]# salt \*-ems\* test.ping
vesselsim-win-ems-2:
True
vesselsim-win-ems-2:
True
vesselsim-win-ems-2:
True
vesselsim-win-ems-1:
True
vesselsim-win-ems-1:
True
vesselsim-win-ems-1:
True
[root@vesselsim ~]# salt \*-ems\* test.ping
vesselsim-win-ems-2:
True
vesselsim-win-ems-2:
True
vesselsim-win-ems-1:
True
vesselsim-win-ems-1:
True
[root@vesselsim ~]# salt \*-ems\* test.ping
vesselsim-win-ems-2:
True
vesselsim-win-ems-2:
True
vesselsim-win-ems-2:
True
vesselsim-win-ems-1:
True
vesselsim-win-ems-1:
True
vesselsim-win-ems-1:
True
[root@vesselsim ~]# salt \*-ems\* service.restart salt-minion --async
Executed command with job ID: 20240418151619456071
[root@vesselsim ~]# salt-run jobs.lookup_jid 20240418151619456071
vesselsim-win-ems-1:
True
vesselsim-win-ems-2:
True
[INFO ] Runner completed: 20240418151634997026
[root@vesselsim ~]# salt \*-ems\* test.ping
vesselsim-win-ems-2:
True
vesselsim-win-ems-1:
True
[root@vesselsim ~]# salt \*-ems\* test.version
vesselsim-win-ems-2:
3006.4
vesselsim-win-ems-1:
3006.4
[root@vesselsim ~]# salt --version
salt 3006.4 (Sulfur)
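To make the duplication in the transcript above easier to quantify, here is a minimal sketch that counts how many times each minion ID appears in the plain-text output of a `salt ... test.ping` run. It is not part of Salt: the function names `count_returns` and `duplicated` are hypothetical, and the parsing assumes the default text output, where every return starts with an unindented `<minion-id>:` line.

```python
from collections import Counter


def count_returns(output: str) -> Counter:
    """Count how often each minion ID appears in plain-text salt output.

    Assumption: each return starts with an unindented line of the form
    "<minion-id>:" and the returned value follows on the next line.
    """
    ids = []
    for line in output.splitlines():
        stripped = line.strip()
        # Minion ID lines are unindented and end with a colon; shell prompts
        # and log lines in the transcript do not end with a colon.
        if stripped.endswith(":") and not line.startswith((" ", "\t")):
            ids.append(stripped[:-1])
    return Counter(ids)


def duplicated(output: str) -> dict:
    """Return only the minion IDs that were reported more than once."""
    return {mid: n for mid, n in count_returns(output).items() if n > 1}


if __name__ == "__main__":
    # Sample taken from the first test.ping run in this report.
    sample = """\
vesselsim-win-ems-2:
    True
vesselsim-win-ems-2:
    True
vesselsim-win-ems-1:
    True
vesselsim-win-ems-1:
    True
"""
    print(duplicated(sample))
```

Running the same check on successive test.ping outputs would show whether the number of duplicate returns per minion changes between runs (2 and 3 in the transcript above) and confirm that it drops back to one after the minion restart.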