Node crash due to failed rollback in Hybrid
This problem happens to me all the time when I run 2 or more nodes. A node may crash with the following exception:
java.util.NoSuchElementException: versionID not found, can not rollback
at io.iohk.iodb.LSMStore.notFound$1(LSMStore.scala:868)
at io.iohk.iodb.LSMStore.$anonfun$rollback$2(LSMStore.scala:875)
at scala.Option.getOrElse(Option.scala:121)
at io.iohk.iodb.LSMStore.rollback(LSMStore.scala:875)
at examples.hybrid.state.HBoxStoredState.$anonfun$rollbackTo$1(HBoxStoredState.scala:88)
at scala.util.Try$.apply(Try.scala:209)
at examples.hybrid.state.HBoxStoredState.rollbackTo(HBoxStoredState.scala:84)
at scorex.core.NodeViewHolder.$anonfun$updateState$1(NodeViewHolder.scala:230)
at scala.util.Try$.apply(Try.scala:209)
at scorex.core.NodeViewHolder.updateState(NodeViewHolder.scala:226)
at scorex.core.NodeViewHolder.updateState$(NodeViewHolder.scala:220)
at examples.hybrid.HybridNodeViewHolder.updateState(HybridNodeViewHolder.scala:26)
at examples.hybrid.HybridNodeViewHolder.pmodModify(HybridNodeViewHolder.scala:123)
at examples.hybrid.HybridNodeViewHolder.pmodModify(HybridNodeViewHolder.scala:26)
at scorex.core.NodeViewHolder$$anonfun$processLocallyGeneratedModifiers$1.applyOrElse(NodeViewHolder.scala:373)
......
So at some point a node is unable to perform a rollback. A brief look at the issue led me to this function (source code link), which, as far as I can see, isn't fully implemented. It is probably the cause of the crashes (though I'm not fully sure).
Are you also having this problem? Are you going to fix it anytime soon?
@dkaidalov , I have recently executed a few nodes (see #152), and I didn't experience this behaviour. Could you tell us in more detail what you did to see this error?
This error is related to HBoxState, not history. @dkaidalov could you provide more details? I'm trying to reproduce it.
Indeed, the error is raised in HBoxStoredState, but it is caused by the improperly implemented HybridHistory::bestForkChanges.
The main problem is that HybridHistory::bestForkChanges returns a ProgressInfo structure whose toApply field contains only the head block instead of the whole applyBlocks array.
This, in turn, means that not all necessary blocks are applied to HBoxStoredState (because it uses the ProgressInfo structure to update its state), and then, at some point, it can't do a rollback because the branching point is unknown to it.
This is my understanding of the problem, but I could be mistaken.
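To make the suspected mechanism concrete, here is a minimal toy sketch of it. None of these types are the real Scorex or IODB classes: Block, ForkChanges (a stand-in for ProgressInfo) and ToyState (a stand-in for HBoxStoredState on top of a versioned store) are hypothetical names invented for illustration, and the real code may behave differently in detail.

```scala
// Toy model only; these are NOT the real Scorex/IODB types.
final case class Block(id: String, parentId: String)

// Hypothetical stand-in for ProgressInfo: the branch point to roll back to
// and the blocks to apply on top of it.
final case class ForkChanges(branchPoint: String, toApply: Seq[Block])

// Hypothetical stand-in for a versioned state: it only knows the versions
// (block ids) that were actually applied to it.
final case class ToyState(versions: Vector[String]) {

  // A block can only be applied on top of its parent's version.
  def applyBlock(b: Block): Either[String, ToyState] =
    if (versions.lastOption.contains(b.parentId)) Right(ToyState(versions :+ b.id))
    else Left(s"Incorrect state version: ${versions.lastOption.getOrElse("none")} found, ${b.parentId} expected")

  // Rolling back is only possible to a version the state has seen.
  def rollbackTo(version: String): Either[String, ToyState] = {
    val idx = versions.indexOf(version)
    if (idx < 0) Left(s"versionID $version not found, can not rollback")
    else Right(ToyState(versions.take(idx + 1)))
  }

  // Roll back to the branch point, then apply every block in toApply in order.
  def update(changes: ForkChanges): Either[String, ToyState] =
    rollbackTo(changes.branchPoint).flatMap { s =>
      changes.toApply.foldLeft[Either[String, ToyState]](Right(s)) { (acc, b) =>
        acc.flatMap(_.applyBlock(b))
      }
    }
}

object RollbackSketch {
  // History switches to a fork of two blocks: g <- b1 <- b2
  val applyBlocks: Seq[Block] = Seq(Block("b1", "g"), Block("b2", "b1"))
  val start: ToyState = ToyState(Vector("g"))

  // The whole fork is handed to the state: it ends up at version b2.
  val full: Either[String, ToyState] = start.update(ForkChanges("g", applyBlocks))

  // Only the head of the fork is handed to the state: it stops at b1,
  // while history already considers b2 the best block.
  val headOnly: Either[String, ToyState] = start.update(ForkChanges("g", applyBlocks.take(1)))

  def main(args: Array[String]): Unit = {
    println(full.flatMap(_.rollbackTo("b2")))                // Right(...): version b2 is known
    println(headOnly.flatMap(_.applyBlock(Block("b3", "b2")))) // Left(Incorrect state version ...)
    println(headOnly.flatMap(_.rollbackTo("b2")))            // Left(versionID b2 not found ...)
  }
}
```

In this toy model, dropping the tail of the fork leaves the state one version behind the history, so the next block fails validation and any later rollback to b2 fails as well.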
I hit this error very often. Two or more nodes are needed. It occurs all the time if block generation is fast. I have a 10s block interval. Here is my config:
miner {
offlineGeneration = true
targetBlockDelay = 10s
blockGenerationDelay = 100ms
rParamX10 = 8
initialDifficulty = 10
posAttachmentSize = 100
}
..............................
PosForger:
val InitialDifficuly = 1500000000L
Note that I decreased PoS initial difficulty to speed up block generation
@daron666 actually I was able to reproduce this crash without any config changes and on a clean master branch (I ran 3 nodes). I also noticed that the mentioned exception usually appears after the following error:
java.lang.IllegalArgumentException: requirement failed: Incorrect state version: 5wySxM4eYTLGFbboEeVYoKyPvu3u5C3fScuHoNAxm6Km found, (B78JEtziqmwttysEiab1KRNZ3oaSUYpUQvubfY5KTy1w || 87Phg3xSAwA58cRTZ1n4zTKmZwsjqkKwcijnmmspjsmA || List()) expected
at scala.Predef$.require(Predef.scala:277)
at examples.hybrid.state.HBoxStoredState.$anonfun$validate$1(HBoxStoredState.scala:56)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
at scala.util.Try$.apply(Try.scala:209)
at examples.hybrid.state.HBoxStoredState.validate(HBoxStoredState.scala:51)
at examples.hybrid.state.HBoxStoredState.validate(HBoxStoredState.scala:22)
at scorex.mid.state.BoxMinimalState.applyModifier(BoxMinimalState.scala:29)
at scorex.mid.state.BoxMinimalState.applyModifier$(BoxMinimalState.scala:28)
at examples.hybrid.state.HBoxStoredState.applyModifier(HBoxStoredState.scala:22)
at scorex.core.NodeViewHolder.updateState(NodeViewHolder.scala:248)
at scorex.core.NodeViewHolder.pmodModify(NodeViewHolder.scala:284)
at scorex.core.NodeViewHolder.pmodModify$(NodeViewHolder.scala:271)
at examples.hybrid.HybridNodeViewHolder.pmodModify(HybridNodeViewHolder.scala:22)
at scorex.core.NodeViewHolder$$anonfun$processLocallyGeneratedModifiers$1.applyOrElse(NodeViewHolder.scala:380)
....
To facilitate reproduction you can:
- Set up all nodes with non-zero balances (by setting seed = "genesisoX") so they can issue PoSBlocks
- Decrease block delay
- Decrease initial PoS difficulty
@dkaidalov I think I've fixed it, please test.
@kushti Subjectively it has become a bit more stable, but the same errors are still reproducible.
I noticed that HybridHistory::bestForkChanges still returns a ProgressInfo structure whose toApply field contains only the head block instead of the whole applyBlocks array. Is that done on purpose?
I ask because, according to my testing, I can confirm that the IncorrectStateVersion exceptions start to appear right after applyBlocks.size > 1 has occurred, and that this eventually leads to the "versionID not found" exception.