Swarm.Tracker - fix `start_pid_remotely` retrying rapidly

Open darrenclark opened this issue 6 years ago • 1 comment

Noticed this message many times in our logs:

remote tracker on #{remote_node} went down during registration, retrying operation..

It seems to happen randomly, but I think this will fix it.

darrenclark avatar May 30 '19 23:05 darrenclark

I saw this happen just today in our production cluster. There haven't been any deploys in about 3 weeks, and everything ran smoothly until this happened.

When it happened, we got ~11M log lines in 2 hours as it kept retrying without being able to recover. We restarted the pods and everything went back to normal.

We are running 2 pods on a k8s cluster. The library is a dependency of https://github.com/commanded/commanded-swarm-registry

We are using Swarm 3.4.0.

pirvudoru avatar May 19 '24 12:05 pirvudoru