Pause sync whilst EE offline
Description
After the merge, Lighthouse will need an EE available via HTTP to verify blocks (payloads, specifically). If it can't call out to the EE, block verification fails. With our present implementation, a chain that fails because the EE is offline is abandoned after some number of attempts and is not retried. This may prevent us from following the chain again once a temporarily offline EE comes back.
After chatting with @AgeManning, it seems we can fix this by:
- Sync choosing to pause itself when it gets EE-related errors.
- Adding a new method to the `execution_layer` which returns a bool indicating if there are any sync/ready EEs available.
- Sync polling that bool and resuming sync once there are EEs online again.
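The three steps above can be sketched roughly as follows. This is only an illustration, not Lighthouse's actual code: `ExecutionLayer`, `any_engine_available`, `SyncManager`, `SyncState`, `BlockError` and `StubEngines` are all hypothetical names invented for this sketch.

```rust
/// Hypothetical stand-in for the `execution_layer` handle.
pub trait ExecutionLayer {
    /// Returns true if at least one EE is currently usable.
    fn any_engine_available(&self) -> bool;
}

/// Stub with a fixed answer, only here to exercise the sketch.
pub struct StubEngines(pub bool);

impl ExecutionLayer for StubEngines {
    fn any_engine_available(&self) -> bool {
        self.0
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncState {
    Running,
    PausedForEngine,
}

/// Hypothetical error classification for a failed block import.
pub enum BlockError {
    /// The EE could not be reached or rejected our request.
    EngineUnavailable,
    /// The block itself is invalid; unrelated to EE availability.
    Invalid,
}

pub struct SyncManager {
    state: SyncState,
}

impl SyncManager {
    pub fn new() -> Self {
        Self { state: SyncState::Running }
    }

    pub fn state(&self) -> SyncState {
        self.state
    }

    /// Step 1: pause on EE-related errors; other errors leave the state alone.
    pub fn on_block_error(&mut self, err: &BlockError) {
        if matches!(err, BlockError::EngineUnavailable) {
            self.state = SyncState::PausedForEngine;
        }
    }

    /// Steps 2 and 3: called periodically; resumes once an EE is back.
    pub fn poll<E: ExecutionLayer>(&mut self, el: &E) {
        if self.state == SyncState::PausedForEngine && el.any_engine_available() {
            self.state = SyncState::Running;
        }
    }
}
```

The point of the design is that an EE outage puts sync on hold without counting as a failed attempt against the chain, so nothing is given up on while the EE is away.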
To be clear, I think we should only pause sync by these rules:
```rust
// Pause only when the EE is unreachable or rejects our credentials.
let should_pause_sync = match state {
    EngineState::Synced => false,
    EngineState::Offline => true,
    EngineState::Syncing => false,
    EngineState::AuthFailed => true,
};
```
In particular, we do want to keep syncing if the EE is syncing. Our optimistic sync implementation means that this is safe (enough). Tagging @pawanjay176 and @divagant-martian :)
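For illustration, here is a self-contained version of that rule, extended to several engines. It assumes that sync should pause only when no configured EE is usable, which is one reading of "any sync/ready EEs available". The `EngineState` enum below is a local copy of the four variants named in this issue, and `should_pause_for` and `should_pause_sync` are hypothetical helpers, not existing Lighthouse functions.

```rust
/// Local copy of the four states discussed in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EngineState {
    Synced,
    Offline,
    Syncing,
    AuthFailed,
}

/// Per-engine rule: a syncing EE is still usable thanks to optimistic sync.
pub fn should_pause_for(state: EngineState) -> bool {
    match state {
        EngineState::Synced | EngineState::Syncing => false,
        EngineState::Offline | EngineState::AuthFailed => true,
    }
}

/// Assumption: pause only if every engine is unusable.
/// An empty list counts as "no usable engine", so it also pauses.
pub fn should_pause_sync(engines: &[EngineState]) -> bool {
    engines.iter().all(|state| should_pause_for(*state))
}
```

With this shape, a single `Syncing` or `Synced` engine is enough to keep sync running, even if every other engine is `Offline` or `AuthFailed`.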
Re-opening since this has not been resolved. The fix for this is a WIP in #3094.
Resolved via #3428