eclair icon indicating copy to clipboard operation
eclair copied to clipboard

Channels and nodes keep getting pruned then downloaded again

Open sstone opened this issue 3 years ago • 5 comments

Running the latest master branch (after https://github.com/ACINQ/eclair/pull/2380) with a channel to horizon (which does not yet have https://github.com/ACINQ/eclair/pull/2380 afaik) I keep seeing in the logs nodes and channels getting pruned. For example (happens with about 12000 channels):

2022-08-18 00:53:36,519 INFO  f.a.eclair.router.Router - pruning shortChannelId=528181x1103x0 (stale)
2022-08-18 00:54:59,142 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - channel shortChannelId=528181x1103x0 is back from the dead! requesting announcements about this channel
2022-08-18 01:02:08,070 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - validating shortChannelId=528181x1103x0
2022-08-18 01:53:33,874 INFO  f.a.eclair.router.Router - pruning shortChannelId=528181x1103x0 (stale)
2022-08-18 01:55:39,803 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - channel shortChannelId=528181x1103x0 is back from the dead! requesting announcements about this channel
2022-08-18 02:02:58,126 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - validating shortChannelId=528181x1103x0
2022-08-18 02:53:33,716 INFO  f.a.eclair.router.Router - pruning shortChannelId=528181x1103x0 (stale)
2022-08-18 02:55:15,422 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - channel shortChannelId=528181x1103x0 is back from the dead! requesting announcements about this channel
2022-08-18 03:02:16,040 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - validating shortChannelId=528181x1103x0
2022-08-18 03:53:30,260 INFO  f.a.eclair.router.Router - pruning shortChannelId=528181x1103x0 (stale)
2022-08-18 03:54:54,782 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - channel shortChannelId=528181x1103x0 is back from the dead! requesting announcements about this channel
2022-08-18 04:01:50,285 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - validating shortChannelId=528181x1103x0
2022-08-18 04:53:32,269 INFO  f.a.eclair.router.Router - pruning shortChannelId=528181x1103x0 (stale)
2022-08-18 04:55:46,839 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - channel shortChannelId=528181x1103x0 is back from the dead! requesting announcements about this channel
2022-08-18 05:02:56,789 INFO  f.a.eclair.router.Router SYN n:03864ef025fde8fb587d989186ce6a4a186895ee44a926bfc370e2c366597a3f8f - validating shortChannelId=528181x1103x0

and:

2022-08-18 00:54:49,762 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 01:55:28,783 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 02:55:05,713 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 03:54:43,609 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 04:55:33,217 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 05:55:04,920 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 06:54:50,142 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 07:55:53,136 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 08:55:09,046 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 09:54:52,352 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 10:55:42,531 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 11:55:13,588 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 12:54:51,964 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 13:56:04,738 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 14:55:10,676 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 15:55:01,797 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)
2022-08-18 16:57:33,840 INFO  f.a.eclair.router.Router - pruning nodeId=02009947c197575f5a948e1e4343c41dc2e6122a9bd644629afb919f30e1115ff8 (stale)

sstone avatar Aug 18 '22 17:08 sstone

This is worth investigating before we release a version of eclair containing that pruning change, there may be an issue here.

t-bast avatar Aug 19 '22 07:08 t-bast

That is very suspicious, the shortChannelId you linked has recent updates in both directions (with timestamps 2022-08-17T23:41:47Z and 2022-08-18T20:50:49Z), I don't understand how they can get flagged as stale...

t-bast avatar Aug 19 '22 09:08 t-bast

I just tried these two updates and StaleChannels.isStale() returns false for both of them...

t-bast avatar Aug 19 '22 09:08 t-bast

I think I understand what is happening. We first run into the following condition:

https://github.com/ACINQ/eclair/blob/323aeec09c57194a669da0fec649f9509de850e0/eclair-core/src/main/scala/fr/acinq/eclair/router/Validation.scala#L435

We remove the channel from the list of pruned channels and re-request its channel_announcement and the associated channel_updates (we've previously deleted the channel updates when we pruned this channel). Since we're in that branch, it means we correctly identified that the channel update was not stale.

But then it looks like we later identify the channel as a whole as stale in the following code:

https://github.com/ACINQ/eclair/blob/a97e88fae19b806d5b81f4956acc388f4490f3d5/eclair-core/src/main/scala/fr/acinq/eclair/router/StaleChannels.scala#L93

My theory is that this happens because we haven't yet re-received the channel updates, one of them is probably None so we prune the channel again. But I don't understand why we're not receiving the channel updates in time, do you see anything in the logs that would explain it @sstone? If that is what happens, I don't see how it's related to #2380, but I may be missing something.

t-bast avatar Aug 19 '22 09:08 t-bast

Could this be because we're not setting any extended gossip query flags here:

https://github.com/ACINQ/eclair/blob/a97e88fae19b806d5b81f4956acc388f4490f3d5/eclair-core/src/main/scala/fr/acinq/eclair/router/Validation.scala#L442

IIUC we should set flags to request the channel updates and we don't?

t-bast avatar Aug 19 '22 09:08 t-bast