Mempool chainstate synchronization issues
(Created on Aug 31, 2023 by @iljakuklic)
Mempool and chainstate operate concurrently. When a transaction is processed, the mempool employs the transaction verifier, which in turn performs a series of chainstate queries. However, the state of the blockchain may change between the individual queries, so a transaction may end up being validated against an inconsistent blockchain state.
Currently, this is mitigated by checking the tip before and after the verification. If the two do not match, the verification is re-attempted, up to a certain small number of times. This should almost always work, since blocks are not supposed to arrive very often by computer time standards. It is, however, still somewhat suboptimal and improvements can be made. Besides providing an inconsistent view of the state, performing many inter-subsystem queries incurs non-trivial synchronisation overhead.
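For illustration, a minimal sketch of this tip-check-and-retry approach, assuming hypothetical placeholder names (`Chainstate`, `best_block_id`, `verify`) rather than the actual API:

```rust
// Minimal sketch of the "check the tip before and after" mitigation.
// All names here are illustrative placeholders, not the actual mempool API.

type BlockId = u64; // stand-in for the real block id type

trait Chainstate {
    fn best_block_id(&self) -> BlockId;
    // ... the other query methods used by the transaction verifier
}

struct Tx; // stand-in for a signed transaction

#[derive(Debug)]
enum Error {
    TipMovedTooOften,
}

// The verifier performs several chainstate queries; the chain may advance
// between any two of them, which is what the tip check below detects.
fn verify(_chainstate: &dyn Chainstate, _tx: &Tx) -> Result<(), Error> {
    Ok(())
}

const MAX_ATTEMPTS: usize = 3; // "a certain small number of times"

fn validate_with_tip_check(chainstate: &dyn Chainstate, tx: &Tx) -> Result<(), Error> {
    for _ in 0..MAX_ATTEMPTS {
        let tip_before = chainstate.best_block_id();
        let result = verify(chainstate, tx);
        let tip_after = chainstate.best_block_id();
        if tip_before == tip_after {
            // The tip did not move, so all queries saw the same state.
            return result;
        }
        // The tip moved mid-verification; retry against the new state.
    }
    Err(Error::TipMovedTooOften)
}
```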
A possible improvement is to collect all the pieces of information required to verify the transaction in a single query and store them in a data structure. That data structure can then be used as the storage backend for the transaction verifier. That way, the number of subsystem calls is minimised and we get a consistent state to query against. In addition, the BlockingHandle machinery for calling subsystems from non-async contexts would likely no longer be needed.
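One possible shape of this idea, with made-up names (`VerifierSnapshot`, `snapshot_for_tx`) standing in for whatever the real interface would be:

```rust
// Hypothetical sketch: gather everything the verifier needs in one chainstate
// call, then verify against the in-memory snapshot with no further
// inter-subsystem queries (and hence no BlockingHandle).

use std::collections::BTreeMap;

type BlockId = u64;            // placeholders for the real types
type TxId = u64;
type OutPoint = (TxId, u32);
struct Utxo;
struct Tx;

/// Everything the transaction verifier needs, captured at a single tip.
struct VerifierSnapshot {
    tip: BlockId,
    utxos: BTreeMap<OutPoint, Utxo>,
    // ... token data, pool balances, etc.
}

trait Chainstate {
    /// One round trip: collect all data relevant to `tx` at the current tip.
    fn snapshot_for_tx(&self, tx: &Tx) -> VerifierSnapshot;
}

/// The verifier reads only from the snapshot, so every lookup is answered
/// from the same blockchain state (the one at `snapshot.tip`).
fn verify_against_snapshot(snapshot: &VerifierSnapshot, _tx: &Tx) -> bool {
    // Placeholder: real checks would look up the tx inputs in snapshot.utxos, etc.
    let _ = (&snapshot.tip, &snapshot.utxos);
    true
}
```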
(Comment by Sam)
I feel like there's a mixup in this description between two problems: one of them has no solution, while the other does.
Problem 1: Synchronization between the mempool and the chainstate. In the most general terms, there's no solution to this problem; it's the classical CAP theorem problem. The system we created (the subsystems manager) is consistent and partition tolerant. It can never be always available, which seems to be the requirement in this issue as I understand it. Now, the solution proposed here (preparing data), unless I misunderstand something, doesn't make sense, because it creates a trade-off between availability and consistency in CAP. Sure, you can make the operation more "available" by pre-caching information, but then the data we use will never be consistent with the chainstate, because the chainstate can still receive a new block concurrently, and that new block will invalidate the current state of the mempool. So preparing data doesn't really help in solving the synchronization issue.
Problem 2: Optimizing the communication between the mempool and chainstate. Now that is completely correct and should be done, but this isn't gonna fix the synchronization issue. It's purely a performance fix that I totally agree with.
(Comment by @iljakuklic)
The claim is that it is an improvement if the data the transaction is validated against is a snapshot of the state at a particular block (not necessarily the most up-to-date one). Currently, we have a situation where a transaction can be validated against a state that is a mixture of two (or more) states between blocks. Sure, this is addressed by re-checking whether the tip has moved in the meantime, but eliminating this case entirely is IMHO still preferable.
It does not guarantee that the chainstate and mempool have mutually consistent views of the state at all times, but at least the mempool gets an internally consistent snapshot of the state to work with.
(Comment by Sam)
I agree. My contention was that the distinction wasn't clear.
This is also related to the problem of huge logs in p2p tests.
The background: our tests currently may generate huge logs, and their size may vary greatly from run to run. In particular, we had a case where running the tests on a (very slow) Windows machine produced 250 MB of logs, while running them on Ubuntu on the same commit produced "only" 12 MB (still a lot). Both log files were flooded with the lines "Adding transaction ...", "Transaction rejected: Orphan transaction error: Orphans not supported for transactions originating at reorged-out block" and "Disconnected transaction ... no longer validates" (the 250 MB one basically consisted of these lines).
The culprit seems to be the initial_download_unexpected_disconnect test in p2p, which generates 1000 blocks on one test node and sends them to another one. The blocks are 10 minutes apart (in mocked time), so the last ~144 of them (one day's worth at 10-minute intervals) are "today's" and are therefore considered "fresh".
The problem lies in the fact that the mempool can be out of sync with the chainstate: when the mempool receives an event from the chainstate (which may be delayed and therefore represent a point in the past), it will still use its current state while processing the event (e.g. the current utxo set, best block id, is_initial_block_download flag). And the slower the machine, the higher the chance of the two getting out of sync.
So what happens is:
- An outdated "new tip" event arrives at the mempool, which then tries to reorg to this "new" block over its actual tip, which is treated as the "old" one.
- The transactions disconnected from the "old" tip get added back to the mempool; this is where "Adding transaction" comes from.
- But the utxos that they spend don't exist, so they are considered orphans, and since their origin is "local", they are rejected; this is where "Transaction rejected" and "Disconnected transaction" come from.
In the aforementioned "Ubuntu" case this was happening only for the last ~144 blocks, because the mempool avoids adding transactions back while "is_initial_block_download" is true. But in the "Windows" case it was even worse: because the machine was so slow, the mempool seems to have started receiving "new tip" messages only when the chainstate already had almost 900 blocks and "is_initial_block_download" was already false. So this was happening for all 1000 blocks.
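A much-simplified sketch of the race described above; all names are illustrative and the real mempool logic is more involved:

```rust
// Illustrative only: a stale "new tip" event is processed against the
// mempool's *current* state, not the state at the time the event was sent.

type BlockId = u64;

struct NewTipEvent {
    block_id: BlockId, // the chainstate tip at the moment the event was emitted
}

struct Mempool {
    current_tip: BlockId,            // may already be ahead of the event
    is_initial_block_download: bool, // the *current* IBD flag, not the one in
                                     // effect when the event was sent
}

impl Mempool {
    fn on_new_tip(&mut self, event: NewTipEvent) {
        if event.block_id != self.current_tip {
            // Treated as a reorg from the mempool's current tip to the event's
            // block. Outside of IBD, transactions from the blocks being
            // "disconnected" are added back ("Adding transaction ..."), but
            // their inputs are missing from the current utxo set, so they are
            // rejected as orphans originating at a reorged-out block.
            if !self.is_initial_block_download {
                // re-add the disconnected transactions here (omitted)
            }
        }
        self.current_tip = event.block_id;
    }
}
```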