
Show stats for historical chain reorgs

Open xaur opened this issue 6 years ago • 2 comments

Decred is known for its resistance to minority forks. Also, the shallower the chain reorgs, the more confident users and merchants can be.

It would be interesting to demonstrate this by showing statistics for all known historical chain reorgs. There is already a page for side chains, but it doesn't clearly give stats like "Decred mainnet chain had 1922 reorgs of depth 1, 25 reorgs of depth 2, 3 reorgs of depth 3, and 0 reorgs of depth >3".
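A minimal sketch of how such a summary could be tallied. It assumes chain tip entries shaped like `getchaintips` RPC output (`height`, `hash`, `branchlen`, `status`) and treats `branchlen` as a proxy for reorg depth. That proxy is an assumption: a side branch that never became the best chain is not strictly a reorg. The function name and sample data are made up for illustration.

```python
from collections import Counter


def reorg_depth_histogram(chain_tips):
    """Count side chain tips by branch length (used here as reorg depth)."""
    depths = Counter()
    for tip in chain_tips:
        # branchlen == 0 marks the main chain tip, which is not a side chain.
        if tip["branchlen"] > 0:
            depths[tip["branchlen"]] += 1
    # Return a plain dict ordered by depth for stable display.
    return dict(sorted(depths.items()))


# Made-up sample data, for illustration only (not real Decred blocks).
sample_tips = [
    {"height": 100, "hash": "aa", "branchlen": 0, "status": "active"},
    {"height": 90, "hash": "bb", "branchlen": 1, "status": "valid-fork"},
    {"height": 80, "hash": "cc", "branchlen": 1, "status": "valid-headers"},
    {"height": 70, "hash": "dd", "branchlen": 2, "status": "valid-fork"},
]

histogram = reorg_depth_histogram(sample_tips)
for depth, count in histogram.items():
    print(f"{count} side chain(s) of depth {depth}")
```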

I could gist the chaintips and then dcrdata could diff what it knows about and make a list of old blocks to import and get. However, any time dcrdata is reset, it would obviously get wiped out again because side chain blocks are never downloaded on fresh syncs. (chat)

The gist: https://gist.github.com/davecgh/e4bc996b839b6abc081457d84afc6bb8
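The diff described in the chat excerpt could look roughly like the sketch below. It assumes the published chain tips are a list of entries with `hash` and `branchlen` fields and that the set of block hashes dcrdata already stores is available as plain strings. Both the function name and the data layout are hypothetical, not dcrdata's actual API.

```python
def missing_side_chain_tips(published_tips, known_hashes):
    """Return published side chain tips whose hash is not yet stored locally."""
    known = set(known_hashes)
    return [
        tip
        for tip in published_tips
        # Skip the main chain tip (branchlen == 0) and anything already known.
        if tip["branchlen"] > 0 and tip["hash"] not in known
    ]


# Made-up sample data, for illustration only.
published = [
    {"height": 100, "hash": "aa", "branchlen": 0},
    {"height": 90, "hash": "bb", "branchlen": 1},
    {"height": 70, "hash": "dd", "branchlen": 2},
]
already_stored = ["aa", "bb"]

to_import = missing_side_chain_tips(published, already_stored)
```

This only identifies missing tips. For a branch longer than one block, the earlier blocks of that branch would still have to be collected by walking back through previous-block hashes until the main chain is reached.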

Restoring this data after resets might require some extra storage and logic.

Side chain blocks can also be interesting to researchers, so it would be nice to make them available for download somehow. The first thing that comes to mind is of course Git: I estimate that all known side chain blocks (without the main chain, of course) would make for a Git repo of about 20 MB. But that's just one idea; any other way to share the data with researchers will work.

xaur, Jan 17 '20 20:01

OK, we can do this. However, it will take some exporting of raw block data from various nodes, followed by development of some simple tooling to streamline the import/export process for future deployments.

chappjc, Jan 17 '20 21:01

Implementing this ad hoc import/export for side chain data does not require developing generic data sync protocols and software, but I'll note the idea anyway because it is related and I can see it serving multiple applications.

The generic use case is that researchers would benefit from standard ways to share datasets, which are often time series. These can be side chain data, or interesting data derived from the blocks, like expensive indexes built by dcrdata or on-chain metrics produced by Checkmate or Permabull Nino. They can also be off-chain data like market histories or snapshots of social media metrics.

Related: https://github.com/raedahgroup/dcrextdata/issues/187

xaur, Apr 09 '20 15:04