
Pools sometimes reporting unexpected state

Open kbs1 opened this issue 6 years ago • 0 comments

As the number of nodes in the network keeps shrinking (as is happening now), the pools often report their state as "Trying to connect to the main network" when in reality they are synced.

The main block counts match, the difficulty also matches, and both are the same as on other nodes that report "normal operation" at that time.

On xdag.org, we have implemented a debug feature called "difficulties sync", which overrides the reported pool state to "normal operation" if:

  1. the pool is NOT loading blocks from local storage
  2. simulate desync (another debug feature of ours) is disabled
  3. the difficulties match
  4. the number of main blocks matches
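
For illustration, the four conditions above can be sketched roughly as follows. This is not the actual xdag.org code: `PoolSnapshot`, its field names, and `effective_state` are hypothetical, and the sketch assumes that the difficulty and main block figures are available as a local value and a network-wide value.

```python
from dataclasses import dataclass

NORMAL = "normal operation"


@dataclass
class PoolSnapshot:
    """Hypothetical view of one pool's reported status (names are assumptions)."""
    state: str                  # state string reported by the daemon
    loading_from_storage: bool  # True while blocks are loaded from local storage
    local_difficulty: int       # chain difficulty as seen by this node
    network_difficulty: int     # highest difficulty the node knows of
    local_main_blocks: int      # main blocks as seen by this node
    network_main_blocks: int    # highest main block count the node knows of


def effective_state(snap: PoolSnapshot, simulate_desync: bool = False) -> str:
    """Return the state to display, applying the "difficulties sync" override."""
    override = (
        not snap.loading_from_storage                              # condition 1
        and not simulate_desync                                    # condition 2
        and snap.local_difficulty == snap.network_difficulty       # condition 3
        and snap.local_main_blocks == snap.network_main_blocks     # condition 4
    )
    return NORMAL if override else snap.state
```

With such a helper, a pool that reports "Trying to connect to the main network" while all four conditions hold would be displayed as "normal operation"; in every other case the reported state is passed through unchanged.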

The problem is that I had to replicate these changes in XDAG explorer (only as a quick patch on the node, not in the GitHub code), because it reports that it is "synchronizing" any time the node enters the "Trying to connect to the main network" state.

This state tends to last for about 3 minutes. Then the node switches to "synchronizing", followed by "normal operation". Then the cycle may trigger again without any apparent reason.

The whole time this is happening:

  1. the node is synchronized
  2. neither the CPU nor the disk is heavily utilized
  3. the difficulties match those of other nodes reporting as synchronized
  4. the number of main blocks matches
  5. there are active nodes in the "net conn" output, the packet and byte counters keep increasing, and the dropped packet count stays constant (almost always 0)

I would prefer not to keep the "difficulties sync" override of pool state in XDAG explorer for very long. The check isn't the cleanest, and it doesn't protect in cases where the node really does get disconnected from the main network.

In that case, the difficulties would still match and so would the main blocks, as it would be the only node in the network. And since the block explorer must be correct, this "override" is not very suitable for the project.
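
A small sketch of this failure mode, under the assumption that the network-wide figures are simply the highest values the node has heard of, its own included. The names `network_view` and `override_applies` are hypothetical and do not come from the XDAG code.

```python
def network_view(local_value, peer_values):
    """Assumed model: the network-wide figure is the maximum of the node's
    own value and whatever its peers have reported."""
    return max([local_value, *peer_values])


def override_applies(local_diff, net_diff, local_main, net_main,
                     loading=False, simulate_desync=False):
    """True when the override would force the state to "normal operation"."""
    return (
        not loading
        and not simulate_desync
        and local_diff == net_diff
        and local_main == net_main
    )


local_diff, local_main = 1000, 50

# Isolated node: no peers report anything, so the "network" view is just
# the node's own numbers and every comparison trivially succeeds.
isolated_passes = override_applies(
    local_diff, network_view(local_diff, []),
    local_main, network_view(local_main, []),
)

# Node that is genuinely behind: a peer reports higher figures, so the
# comparison fails and the daemon's own state would be shown instead.
behind_passes = override_applies(
    local_diff, network_view(local_diff, [1200]),
    local_main, network_view(local_main, [60]),
)
```

Under this model the isolated node passes the check even though it is cut off, which is the reason the override cannot distinguish a real disconnect from the spurious state.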

Without the override, we had frequent "block explorer is currently synchronizing" error pages as the node reported an unexpected daemon state.

Thank you! :)

kbs1 • Apr 10 '19 20:04