zos
Zos does not retry across the full set of tfchain nodes when one becomes unreachable
We're seeing nodes fail to submit uptime reports, apparently because of the current outage of tfchain node 04.tfchain.grid.tf (which is known and is being addressed). The logs show each affected node repeatedly trying to contact the IP of that one tfchain node and never trying the others. Two examples are shown below.
Node 5471
Node 493
Node ID: 7195