
Performance: Split data sources for indexing

Open • awfm9 opened this issue on Sep 11 '21 • 4 comments

In order to prepare for augmenting the DPS index data, it would be good to separate chain data and DPS index data. The DPS index will only contain the Ledger payloads. Everything else, which is basically just transcoded data from Flow databases, will go into a separate database.

I recommend that we use the following nomenclature:

  • DPS ledger index/database
  • DPS entity index/database
  • DPS insight index/database

— awfm9, Sep 11 '21 11:09

Right now, the biggest performance issue when it comes to the Live Indexer is that catching up is really slow. Splitting the data we read into multiple databases can also improve the performance for the static indexer, of course.

The idea would be to have three distinct databases we read from:

  • The protocol state database from consensus nodes
  • A database specifically for trie updates
  • A database specifically for the rest of the data from the execution nodes

By having those three databases instead of a single one, we can parallelize requests and reduce the friction in our current bottleneck, which is the indexing of transactions and trie updates.
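
A minimal sketch of what those parallel reads could look like with the dgraph-io/badger client; the database paths and the lookups are placeholders, not actual flow-dps code:

package main

import (
	"log"
	"sync"

	"github.com/dgraph-io/badger/v3"
)

func main() {
	// Hypothetical paths: one database per data source.
	paths := []string{"/data/protocol", "/data/trie", "/data/execution"}
	dbs := make([]*badger.DB, 0, len(paths))
	for _, path := range paths {
		db, err := badger.Open(badger.DefaultOptions(path))
		if err != nil {
			log.Fatal(err)
		}
		defer db.Close()
		dbs = append(dbs, db)
	}

	// Each read transaction runs against its own Badger instance, so
	// the lookups no longer contend on a single LSM tree.
	var wg sync.WaitGroup
	for _, db := range dbs {
		wg.Add(1)
		go func(db *badger.DB) {
			defer wg.Done()
			_ = db.View(func(txn *badger.Txn) error {
				// ... look up the keys this database is responsible for
				return nil
			})
		}(db)
	}
	wg.Wait()
}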

Things to clarify

~How to split the data into multiple databases? We most likely don't want to go through all of it and clone it, as that would take as much time as actual indexing.~

— Ullaakut, Sep 28 '21 09:09

I think one important point that is missing from your summary is that the protocol state database is the reference protocol state, as found on consensus nodes; not execution nodes. Other than that, it sounds excellent :+1:.

Regarding the approach, I think we would need to switch to a design with two writers, and possibly 2-3 readers (though the readers could remain unified as one, just with three DB dependencies). Both writers (execution data, ledger data) would flush their Badger transactions at the same interval, but offset by half an interval so they are interleaved optimally.
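
A minimal sketch of that flush schedule; the function and names are illustrative only, not actual flow-dps code:

package indexer

import (
	"log"
	"time"

	"github.com/dgraph-io/badger/v3"
)

// runWriters commits a Badger transaction for each writer every
// interval, with the ledger writer offset by half an interval so the
// two flushes interleave.
func runWriters(execDB, ledgerDB *badger.DB, interval time.Duration) {
	flush := func(db *badger.DB, offset time.Duration) {
		time.Sleep(offset) // stagger the first commit by the offset
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for range ticker.C {
			err := db.Update(func(txn *badger.Txn) error {
				// ... stage this writer's pending index entries
				return nil
			})
			if err != nil {
				log.Println("flush failed:", err)
			}
		}
	}
	go flush(execDB, 0)
	go flush(ledgerDB, interval/2)
}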

— awfm9, Sep 28 '21 09:09

@awfm9

"I think we would need to switch to a design with two writers"

How would you suggest we split them, and how would they be used by the mapper? I guess we would need the mapper to run both indexing tasks in separate goroutines, which would slightly complicate error handling (see the sketch below).
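
A minimal sketch of how the mapper could run the two tasks concurrently and still surface a single error, assuming golang.org/x/sync/errgroup; indexExecution and indexConsensus are hypothetical stand-ins:

package mapper

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// Hypothetical stand-ins for the two indexing tasks.
func indexExecution(ctx context.Context, height uint64) error { return nil }
func indexConsensus(ctx context.Context, height uint64) error { return nil }

// indexBoth runs both tasks concurrently; if either fails, the shared
// context is cancelled and the first error is returned to the caller.
func indexBoth(ctx context.Context, height uint64) error {
	g, ctx := errgroup.WithContext(ctx)
	g.Go(func() error { return indexExecution(ctx, height) })
	g.Go(func() error { return indexConsensus(ctx, height) })
	if err := g.Wait(); err != nil {
		return fmt.Errorf("indexing failed at height %d: %w", height, err)
	}
	return nil
}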

"and possibly 2-3 readers (though the readers could remain unified as one, just with three DB dependencies)"

So assuming that we take a new DB as input with the non-trie-update data from the exec nodes, we would now have:

  • The LedgerWAL consumed by the Feeder
  • The Protocol State DB consumed by the Chain, which would have a slightly reduced scope in what it fetches from the DB
  • The other DB consumed by a new kind of reader, which would implement part of what the Chain currently does

Is that right? Having multiple databases but accessing them sequentially provides no performance improvement, so once again I assume we'd want the mapper to somehow fetch this data concurrently, perhaps behind a unified reader like the sketch below.
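
A minimal sketch of such a unified reader with three DB dependencies; all names and the key layout are hypothetical:

package index

import "github.com/dgraph-io/badger/v3"

// Reader keeps a single public API but owns all three database
// handles, so callers don't need to know about the split.
type Reader struct {
	protocol  *badger.DB // reference protocol state (consensus nodes)
	trie      *badger.DB // trie updates
	execution *badger.DB // remaining execution node data
}

// Seals sketches one accessor routed to the execution database; the
// key layout and decoding are elided.
func (r *Reader) Seals(height uint64) ([][]byte, error) {
	var seals [][]byte
	err := r.execution.View(func(txn *badger.Txn) error {
		// ... iterate the height-indexed seal keys and collect values
		return nil
	})
	return seals, err
}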

— Ullaakut, Sep 30 '21 07:09

Change              Performance Impact
Badger v3 Upgrade   None
Split up reader     +14%
Split up writer     +5%

Badger v3 Upgrade

This change was a simple switch of the Badger version used for our index database.

Split up reader

This change was made (in a quick and dirty way) by cloning the protocol state DB and opening both copies with two separate readers, so that they can be read in parallel by the mapper. It resulted in a performance increase of around 14% on my machine.

One reader reads specifically the Transaction Results, Events and Seals (execution data) while the other reads everything else.

Split up writer

This change was made by writing to three separate Badger databases for indexing instead of a single one. This allows writing to the protocol index and the chain index in parallel. Unfortunately, on my machine this only resulted in a performance increase of around 5%.

One writer writes specifically the Transaction Results, Events and Seals while the other writes everything else except trie updates. Depending on the block, both operations seem to take roughly the same amount of time, but sometimes indexing consensus data takes about twice as long. Still, even assuming they always took exactly the same amount of time, running them in parallel can at best halve the write phase, which would only save another 5% or so, at least with the dataset I have. Maybe this is more significant with real data.

Example:

logs.json
{
  "level": "info",
  "component": "mapper_transitions",
  "duration": 3.5319,
  "db": "execution",
  "time": "2021-10-01T08:37:49Z",
  "message": "Finished indexing goroutine"
}
{
  "level": "info",
  "component": "mapper_transitions",
  "duration": 5.7975,
  "db": "consensus",
  "time": "2021-10-01T08:37:49Z",
  "message": "Finished indexing goroutine"
}
{
  "level": "info",
  "component": "mapper_transitions",
  "duration": 3.4241,
  "db": "execution",
  "time": "2021-10-01T08:37:49Z",
  "message": "Finished indexing goroutine"
}
{
  "level": "info",
  "component": "mapper_transitions",
  "duration": 6.3003,
  "db": "consensus",
  "time": "2021-10-01T08:37:49Z",
  "message": "Finished indexing goroutine"
}

— Ullaakut, Sep 30 '21 09:09