Add a `graphman prune` command
The command will have the form `graphman prune <subgraph> <offset>` and remove all entity versions that were deleted or updated at least `offset` blocks before the current subgraph head. For a pruned subgraph, the index-node API will report that block number as the earliest block, and queries at a block height before that block will fail with an error. Pruning will not affect query results for queries at a block after the pruned block, and therefore simply limits how far back time-travel queries can reach.
Since pruning removes a huge amount of data that is usually not accessed by queries, it speeds up queries significantly.
As part of this issue, pruning will be a one-time action, i.e., it only removes history at the point in time when it is run.
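To make the intended semantics concrete, here is a minimal Python sketch of the pruning rule described above. It is not how `graphman` or the store is implemented. The `Version` model, the half-open `[start, end)` block range, and the names `prune`, `query`, and `PrunedBlockError` are all assumptions made for illustration. The sketch also assumes that a query exactly at the earliest block still succeeds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    """Hypothetical model of one entity version, valid for blocks [start, end)."""
    entity_id: str
    value: int
    start: int            # first block at which this version is visible
    end: Optional[int]    # first block at which it is no longer visible; None = current


class PrunedBlockError(Exception):
    """Raised when a query targets a block before the earliest retained block."""


def prune(versions, head, offset):
    """Drop versions that were deleted or updated at least `offset` blocks before `head`.

    Returns the retained versions and the new earliest queryable block.
    """
    earliest = head - offset
    kept = [v for v in versions if v.end is None or v.end > earliest]
    return kept, earliest


def query(versions, block, earliest=0):
    """Time-travel query: entity values as of `block`."""
    if block < earliest:
        raise PrunedBlockError(f"block {block} is before earliest block {earliest}")
    return {
        v.entity_id: v.value
        for v in versions
        if v.start <= block and (v.end is None or block < v.end)
    }


# Made-up history: entity "a" is updated at blocks 50 and 90, "b" is deleted at block 40.
history = [
    Version("a", 1, 0, 50),
    Version("a", 2, 50, 90),
    Version("a", 3, 90, None),
    Version("b", 7, 10, 40),
]

kept, earliest = prune(history, head=100, offset=20)

# Queries at or after the earliest block see the same data as before pruning.
unchanged = all(
    query(kept, b, earliest) == query(history, b) for b in range(earliest, 101)
)
```

In this sketch the versions of `a` and `b` that ended at blocks 50 and 40 are removed, the version of `a` that ended at block 90 is kept because it is still visible at block 80, and `query(kept, 79, earliest)` raises `PrunedBlockError`. Because `prune` is only applied once, `earliest` stays at 80 as the head advances, which mirrors the one-time nature of pruning in this issue.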
@lutter on the "earliest block", I wonder if that might break some applications' assumptions (they might rely on it, for example, to identify how synced a subgraph is).
The earliest block number would still be accurate; it's just that the hash for the earliest block gets filled with a dummy value. Right now, the earliest block is always the start block for the subgraph; with pruning the block number could/would move up over time. But I have no idea if anybody is relying on the block hash for anything.