Pause sync whilst EE offline

Open paulhauner opened this issue 4 years ago • 2 comments

Description

After the merge, Lighthouse will need an EE (execution engine) available via HTTP to verify blocks (payloads, specifically). If it can't call out to the EE, block verification fails. With our present implementation, chains that fail because the EE is offline are retried only a limited number of times and then not retried again. This may prevent us from following the chain again once a temporarily offline EE comes back.

After chatting with @AgeManning, it seems we can achieve this by:

  • Sync choosing to pause itself when it gets EE-related errors.
  • Adding a new method to the execution_layer which returns a bool indicating whether any synced/ready EEs are available.
  • Sync polling that bool and resuming once EEs are online again.
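
A minimal sketch of the three steps above, not Lighthouse's actual code. Every name here (`ExecutionLayer`, `any_engine_available`, `Sync`, `SyncStatus`, `BlockError`, `MockEl`) is hypothetical and only illustrates the pause/poll/resume flow:

```rust
/// Hypothetical view of the execution layer: the proposal is a single
/// method returning a bool that says whether any EE is usable.
trait ExecutionLayer {
    fn any_engine_available(&self) -> bool;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncStatus {
    Running,
    Paused,
}

/// Hypothetical error classes: only EE-related errors should pause sync.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlockError {
    EngineUnavailable,
    Invalid,
}

struct Sync {
    status: SyncStatus,
}

impl Sync {
    fn new() -> Self {
        Sync { status: SyncStatus::Running }
    }

    /// Step 1: pause on EE-related errors instead of burning retry attempts.
    fn on_block_error(&mut self, err: BlockError) {
        if err == BlockError::EngineUnavailable {
            self.status = SyncStatus::Paused;
        }
    }

    /// Step 3: called periodically; resumes once an EE is available again.
    fn poll<E: ExecutionLayer>(&mut self, el: &E) {
        if self.status == SyncStatus::Paused && el.any_engine_available() {
            self.status = SyncStatus::Running;
        }
    }
}

/// Test double standing in for the real execution layer (step 2).
struct MockEl {
    online: bool,
}

impl ExecutionLayer for MockEl {
    fn any_engine_available(&self) -> bool {
        self.online
    }
}
```

In this sketch an invalid block leaves sync running, while an EE-related error pauses it until a later `poll` observes an available engine.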

paulhauner • Feb 18 '22 04:02

To be clear, I think we should pause sync only according to these rules:

let should_pause_sync = match state {
  EngineState::Synced => false,
  EngineState::Offline => true,
  // Keep syncing while the EE is itself syncing.
  EngineState::Syncing => false,
  EngineState::AuthFailed => true,
};

In particular, we do want to keep syncing if the EE is syncing. Our optimistic sync implementation means that this is safe (enough). Tagging @pawanjay176 and @divagant-martian :)
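
For reference, the rules above can be written as a self-contained function. This is a sketch: the `EngineState` enum below is a stand-in that only mirrors the variant names used in this comment, and `should_pause_all` is a hypothetical helper for the multi-EE case, assuming sync should pause only when no engine is usable.

```rust
/// Stand-in enum mirroring the variant names from the rules above;
/// not the actual Lighthouse definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EngineState {
    Synced,
    Offline,
    Syncing,
    AuthFailed,
}

/// Pause only when the EE cannot be reached or rejects our credentials.
/// A syncing EE does not pause sync.
fn should_pause_sync(state: EngineState) -> bool {
    match state {
        EngineState::Synced | EngineState::Syncing => false,
        EngineState::Offline | EngineState::AuthFailed => true,
    }
}

/// Hypothetical helper for several EEs: pause only if every engine is
/// in a pausing state. An empty slice (no engines configured) also pauses.
fn should_pause_all(states: &[EngineState]) -> bool {
    states.iter().all(|s| should_pause_sync(*s))
}
```

With one engine `Offline` and another `Syncing`, `should_pause_all` returns `false`, so sync keeps going.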

paulhauner • Jul 12 '22 04:07

Re-opening since this has not been resolved. The fix for this is a WIP in #3094.

paulhauner • Aug 01 '22 00:08

Resolved via #3428

paulhauner • Sep 12 '22 08:09