rust-lightning icon indicating copy to clipboard operation
rust-lightning copied to clipboard

Doc Clarity: Persist ChannelMonitors and MonitorUpdates sequentially

Open G8XSU opened this issue 1 year ago • 8 comments

Clarified in documentation that users must persist ChannelMonitors and ChannelMonitorUpdates in sequential order, even more so when using async persistence.

As part of #1792

G8XSU avatar Apr 12 '24 17:04 G8XSU

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.27%. Comparing base (5e41425) to head (29451bf). Report is 859 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2992      +/-   ##
==========================================
+ Coverage   89.26%   92.27%   +3.01%     
==========================================
  Files         118      131      +13     
  Lines       96534   128888   +32354     
  Branches    96534   128888   +32354     
==========================================
+ Hits        86167   118928   +32761     
+ Misses       7872     7380     -492     
- Partials     2495     2580      +85     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Apr 12 '24 17:04 codecov-commenter

This isn't true, though. We don't require it, but have recommended it as it works around some current bugs that we intend to fix.

If we recommend it to clients, isn't it better to have it as part of documentation? It is ok to remove this when it is no longer needed imo.

Isn't it possible to write the latest monitor update first and then overwrite it with an out-of-date ChannelMonitor if the writes end up being out-of-order?

G8XSU avatar Apr 15 '24 17:04 G8XSU

If we recommend it to clients, isn't it better to have it as part of documentation? It is ok to remove this when it is no longer needed imo.

It doesn't make the async persistence in general "safe". We can't reasonably enumerate all the things users have to be aware of when doing async persistence, because we also don't know all of those things. I'm not sure that listing one example is worth it or if it just gives people a false sense of security (at the cost of a ton of extra code complexity on the implementation end). We should focus on fixing the issues, really, IMO.

Isn't it possible to write the latest monitor update first and then overwrite it with an out-of-date ChannelMonitor if the writes end up being out-of-order?

For an individual monitor, indeed, we care about consistency - the latest one has to be the one we have on disk, but of course if the user is persisting the updates then they can go in any order, we just have to replay all of them in order on reload.

TheBlueMatt avatar Apr 15 '24 18:04 TheBlueMatt

if the user is persisting the updates then they can go in any order.

I was talking about full channelMonitor persists resulting from monitor-updates. If there is an out-order-persist, ldk will think we persisted durably but we lost data by persisting the old monitor.

G8XSU avatar Apr 15 '24 18:04 G8XSU

Ah, okay, yea, I guess we can specify that. ISTM we should probably do that in the Persist interface docs, though, and would be good to clarify that we're talking about any full persistences.

TheBlueMatt avatar Apr 15 '24 18:04 TheBlueMatt

ISTM we should probably do that in the Persist interface docs, though

The change is already in Persist interface docs?

would be good to clarify that we're talking about any full persistences.

Even for MonitorUpdatePartial persist, we ideally want them to be in order. This is because current implementation of MonitorUpdatingPersister depends on this "updates are sequential" fact.

If we write update_id .. 7, 8, 10 and don't write update_id 9. update_id 9 will be missing from storage, we will treat its absence as end of further monitor_updates.

Even if we remove MUP from the picture. Persisting update_id 10, without update_id 9, puts us in inconsistent state afaiu, might cause channel closure later? (will it be ok if we could never persist update-9?)

It might be troublesome for multi-device as well, device-1 saved update-10, device-2 saved update-9, now this state can't be resolved.

Overall i see multiple things that can go wrong or are just complicated with out-of-order writes for partial-monitor-updates as well.

G8XSU avatar Apr 16 '24 04:04 G8XSU

The change is already in Persist interface docs?

Uhhhhhh, right. Not sure what I meant.

This is because current implementation of MonitorUpdatingPersister depends on this "updates are sequential" fact.

These are separate things, though. The Persist trait isn't tied to MonitorUpdatingPersister's implementation details, its more generic. We shouldn't be documenting things in Persist that only apply to users of MonitorUpdatingPersister (and some kind of strange async-write-MonitorUpdatingPersister-read setup?)

.Even if we remove MUP from the picture. Persisting update_id 10, without update_id 9, puts us in inconsistent state afaiu, might cause channel closure later? (will it be ok if we could never persist update-9?)

Not strictly, no, but we should document what to do - the ChainMonitor doesn't let the ChannelManager know we're persisted until there's no pending persists at all, so as far as the ChannelManager is concerned we just haven't made any progress, which also means on startup the user is free to ignore any monitor updates after gaps and load without them.

TheBlueMatt avatar May 06 '24 15:05 TheBlueMatt

Whats the status here @G8XSU

TheBlueMatt avatar Jun 03 '24 16:06 TheBlueMatt

  • Removed previous doc update, as monitor-updates can be persisted out-of-order.
  • Instead added doc to indicate "user is free to ignore any monitor updates after gaps and load without them."

G8XSU avatar Aug 20 '24 17:08 G8XSU