[NFC] Add a lazy mode to LocalGraph
LocalGraph by default will compute all the local.sets that can be read from all
local.gets. However, many passes only query a small amount of those. To
avoid wasted work, add a lazy mode that only computes sets when asked about
a get.
This is then used in a single place, LoopInvariantCodeMotion, which becomes 18% faster. Later PRs will use it in more places.
The surgery to LocalGraph.cpp is NFC in that it just splits up the main flow()
function into prepareFlowBlocks(), which we also run in lazy mode, and flow()
which does the eager (non-lazy) flow. All the movement of code is just to get
stuff out of the old flow() into the class scope for that purpose. No changes
are made to the algorithm. Diff without whitespace is slightly smaller.
(Existing tests cover most of this, but I noticed that unreachable code was missing, so that is added.)
It would be helpful to split this up into a preliminary PR that just moves existing code around, followed by a separate PR adding the laziness.
That would split this NFC PR into 2 NFC PRs, but the first one would have a bunch of comments "this function/this global var/etc. will be useful in a later PR". Does it still make sense to you to split, given that?
Can we reimplement the eager version in terms of the lazy version? That would also reduce the amount of new code.
That would have been good, yeah, but it would be less efficient. The eager version does a more efficient job of computing it all than just calling the lazy version iteratively on everything.
It would be helpful to split this up into a preliminary PR that just moves existing code around, followed by a separate PR adding the laziness.
That would split this NFC PR into 2 NFC PRs, but the first one would have a bunch of comments "this function/this global var/etc. will be useful in a later PR". Does it still make sense to you to split, given that?
I don't think we necessarily need to commit new comments just to delete them afterward, but yes, I think moving things to class scope in an existing PR would make things easier to understand.
Can we reimplement the eager version in terms of the lazy version? That would also reduce the amount of new code.
That would have been good, yeah, but it would be less efficient. The eager version does a more efficient job of computing it all than just calling the lazy version iteratively on everything.
What's the difference look like?
I don't think we necessarily need to commit new comments just to delete them afterward, but yes, I think moving things to class scope in an existing PR would make things easier to understand.
It would be strange not to explain things in the middle, though? There would be a bunch of code in a form that makes no sense until it is also used by another place (the lazy mode). Without that extra caller, the code looks weird and in need of refactoring, I think. But, the justification would be "a future PR" - that's why this feels weird to me.
What's the difference look like?
In eager mode we process blocks as a whole, handling each get and set as we see it, and also bundling gets together when possible. That means a single iteration on each block. In lazy mode each get will end up iterating in the block backwards (though there is an optimization to stop if we reach a get/set of the same index).
Last commit removes the laziness-related things here as requested, leaving only a refactoring that splits up flow() and calls it on a LocalGraphFlower instance. I do still think that is a little odd to split up when I look at the code, but I don't feel strongly.