Allow longer state pruning history
Currently, when state pruning is enabled, state-db requires O(n) memory w.r.t. the number of blocks pending pruning, so a long pruning history requires a lot of memory. It should be possible to get rid of this memory requirement by using the reference counting feature of parity-db.
This basically means removing the death_rows and death_index fields of the RefWindow struct. Journals that are currently kept in death_rows should be loaded on demand from the database, although I'd keep an in-memory cache for pruning window size <= 256. death_index should be replaced with reference counting on the DB level.
So that first requires that we get rid of RocksDB?
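A minimal sketch of the proposed behaviour, not the actual state-db code: `MockDb`, `PruningWindow`, `note_block`, `prune_one` and `CACHE_LIMIT` are hypothetical names used only for illustration, and the database is replaced by an in-memory map. Journals are always written to the database; an in-memory copy is kept only when the window is at most 256 blocks.

```rust
use std::collections::{HashMap, VecDeque};

type Key = Vec<u8>;

/// Hypothetical stand-in for the journal storage of the backing database.
#[derive(Default)]
pub struct MockDb {
    journals: HashMap<u64, Vec<Key>>,
    /// Number of journal reads, to show when the cache avoids a query.
    pub reads: usize,
}

impl MockDb {
    pub fn write_journal(&mut self, block: u64, deleted: Vec<Key>) {
        self.journals.insert(block, deleted);
    }
    /// Load a journal on demand and remove it from the database.
    pub fn take_journal(&mut self, block: u64) -> Option<Vec<Key>> {
        self.reads += 1;
        self.journals.remove(&block)
    }
    /// Drop a journal without reading it (the cache already has a copy).
    pub fn discard_journal(&mut self, block: u64) {
        self.journals.remove(&block);
    }
}

/// Assumed threshold taken from the discussion: cache only small windows.
const CACHE_LIMIT: u64 = 256;

pub struct PruningWindow {
    window_size: u64,
    /// Oldest block that has not been pruned yet.
    base: u64,
    /// Number of blocks currently pending pruning.
    pending: u64,
    /// In-memory journals, present only when `window_size <= CACHE_LIMIT`.
    cache: Option<VecDeque<Vec<Key>>>,
}

impl PruningWindow {
    pub fn new(window_size: u64) -> Self {
        let cache = (window_size <= CACHE_LIMIT).then(VecDeque::new);
        Self { window_size, base: 0, pending: 0, cache }
    }

    /// Record the keys deleted by a newly imported block.
    pub fn note_block(&mut self, db: &mut MockDb, deleted: Vec<Key>) {
        let number = self.base + self.pending;
        if let Some(cache) = &mut self.cache {
            cache.push_back(deleted.clone());
        }
        db.write_journal(number, deleted);
        self.pending += 1;
    }

    /// Prune the oldest block once the window is exceeded and return the
    /// keys that should now be deleted from the state database.
    pub fn prune_one(&mut self, db: &mut MockDb) -> Option<Vec<Key>> {
        if self.pending <= self.window_size {
            return None;
        }
        let deleted = match &mut self.cache {
            // Small window: the journal is in memory, no database read.
            Some(cache) => {
                db.discard_journal(self.base);
                cache.pop_front()?
            }
            // Large window: load the journal from the database on demand.
            None => db.take_journal(self.base)?,
        };
        self.base += 1;
        self.pending -= 1;
        Some(deleted)
    }
}
```

With this layout, memory use is bounded by the cache limit rather than by the pruning window, at the cost of one journal read per pruned block for large windows.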
@bkchr Not necessarily. There's a compatibility layer that adds reference counting support for rocksdb as well, albeit inefficiently.
> I'd keep an in-memory cache for pruning window size <= 256.
> death_index should be replaced with reference counting on the DB level.

Or perhaps keep the latest (up to) 256 blocks in an in-memory cache and load the rest on demand from the DB?
death_rows is not actual block data, but a journal: a list of keys that need to be removed from the database when the block is pruned. There's no point in keeping recent journals in memory. When you insert block N and need to prune block N - 10000000, which was inserted months ago, the journal won't be in the cache. Only with a small pruning window can you keep journals in memory and avoid a database query.
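The arithmetic behind this can be sketched as follows. Both helper functions are hypothetical and exist only to illustrate the point: the journal needed at import time belongs to the oldest block in the window, so a cache of the most recent journals only helps when the window itself is smaller than the cache.

```rust
/// Block whose journal is needed when block `imported` is inserted,
/// or `None` while fewer than `window` blocks have been imported.
pub fn block_to_prune(imported: u64, window: u64) -> Option<u64> {
    imported.checked_sub(window)
}

/// True if a cache holding the journals of the `cache_len` most recent
/// blocks (ending at `imported`) still contains the journal of `block`.
pub fn in_recent_cache(imported: u64, cache_len: u64, block: u64) -> bool {
    block <= imported && imported - block < cache_len
}
```

For a window of 10000000 blocks the pruned block is always far outside a 256-entry cache of recent journals, while for a window of 200 blocks it is always inside it.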
Hi @arkpar, it looks like RefWindow already uses the reference counting feature of parity-db by setting RefWindow::count_insertions to false, which eliminates the memory needed by RefWindow::death_index (and also the disk space of JournalRecord::inserted).
But DeathRow::deleted may still be required, because in parity-db the reference counter is stored alongside the referenced kv pair, and upon pruning we still need a way to keep track of which kv pairs need to be deleted before we can actually delete them in the backend db.
> Hi @arkpar, it looks like RefWindow already uses the reference counting feature of parity-db by setting RefWindow::count_insertions to false, which eliminates the memory needed by RefWindow::death_index (and also the disk space of JournalRecord::inserted).
Right, this part is already implemented :)
> But DeathRow::deleted may still be required, because in parity-db the reference counter is stored alongside the referenced kv pair, and upon pruning we still need a way to keep track of which kv pairs need to be deleted before we can actually delete them in the backend db.

It is required indeed, but we don't need to keep it in memory. On each block import, a journal record containing the list of deleted keys is written to the database. When it is time to prune a block, we can get that record from the database.
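A rough sketch of why the deleted-key list is still needed even with reference counting on the DB level. `RefCountedDb`, `insert`, `release` and `prune_block` are hypothetical names, and the map below only imitates a reference-counted store; it is not the parity-db API. Insertions bump a counter, so they need no separate index, but pruning must still be told which keys to release, and that list is what the journal record provides.

```rust
use std::collections::HashMap;

/// Hypothetical reference-counted key-value store: key -> (value, counter).
#[derive(Default)]
pub struct RefCountedDb {
    entries: HashMap<Vec<u8>, (Vec<u8>, u32)>,
}

impl RefCountedDb {
    /// Insert a value, or bump the counter if the key already exists.
    pub fn insert(&mut self, key: &[u8], value: &[u8]) {
        let entry = self
            .entries
            .entry(key.to_vec())
            .or_insert_with(|| (value.to_vec(), 0));
        entry.1 += 1;
    }

    /// Drop one reference; the value is removed when the counter hits zero.
    pub fn release(&mut self, key: &[u8]) {
        let remove = match self.entries.get_mut(key) {
            Some(entry) => {
                entry.1 -= 1;
                entry.1 == 0
            }
            None => false,
        };
        if remove {
            self.entries.remove(key);
        }
    }

    pub fn contains(&self, key: &[u8]) -> bool {
        self.entries.contains_key(key)
    }
}

/// Apply the journal of a block that leaves the pruning window:
/// every key the block deleted loses one reference.
pub fn prune_block(db: &mut RefCountedDb, journal_deleted: &[Vec<u8>]) {
    for key in journal_deleted {
        db.release(key);
    }
}
```

If a key was re-inserted by a later block, pruning an older deletion only decrements the counter and the value survives, which is the job death_index used to do in memory.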
> It is required indeed, but we don't need to keep it in memory. On each block import, a journal record containing the list of deleted keys is written to the database. When it is time to prune a block, we can get that record from the database.

Right, I would like to give it a try.