binaryen icon indicating copy to clipboard operation
binaryen copied to clipboard

[Do not land] Experiment with read-only-to-write for wasm GC

Open kripken opened this issue 3 years ago • 4 comments

This PR documents things I found when trying to add read-only-to-write type optimizations for wasm GC, in case it might be useful in the future. Atm I don't intend to work on this myself as it is not a small amount of work and the expected benefit in j2wasm is not large.

SimplifyGlobals has "read-only-to-write" optimizations, that operate on things like this:

global x = 0;
fun foo() {
  x = x + 1;
}
fun bar() {
  x = x - 1;
}

The global x here has both reads and writes, but all the reads are only for the purpose of writing the value back in. As a result, the global is not actually observed outside of the read-only-to-write patterns x = x + 1 and x = x - 1, and so we can actually remove that global entirely.

A similar operation could be done on wasm GC fields. A concrete example is a refcount field, if we can somehow remove all checks of the refcount except for the increments and decrements. In general any decrement should check if the recount is 0 and then free if so, so for us to optimize this we'd need to remove those checks somehow. I'm not sure how that would happen in real-world code but it does at least in some cases in j2wasm.

In principle this could be done, but I ran into the following issues:

  • GlobalTypeOptimization tracks whether we have reads and whether we have writes. But for this, we'd need to track individual reads and writes because we want to check which reads flow into writes and nowhere else. I tried to detect the entire pattern at once, and issue a notification about a read-only-to-write (a read and a write) and not separate reads and writes, which is what is in this PR.
  • That however is not good enough, as it ignores subtyping. The code here looks for a read and a write of the same type, but operations on sub/supertypes may be relevant. We do handle subtyping in general in GlobalTypeOptimization, of course, but we do it by tracking "precise" info (on struct.new, where the type is known exactly) separate from "imprecise" info (on gets and sets). Those two separate stores of info would need to be merged first before we can tell which read-only-to-write patterns exist. It's not clear to me if it's best to track individual gets/sets here, or change the entire model of precise/imprecise info, but either would be nontrivial.

kripken avatar Mar 21 '22 18:03 kripken

Could this be extended to entire groups of globals that a mutually only ever read from for the purpose of be written back to?

I found the "read-only-to-write" terminology initially confusing because it sounds like a conversion pass: "read-only" -> "write" :)

sbc100 avatar Mar 21 '22 18:03 sbc100

Could this be extended to entire groups of globals that a mutually only ever read from for the purpose of be written back to?

I think so, interesting idea. If we think that is common we could look into it.

I found the "read-only-to-write" terminology initially confusing because it sounds like a conversion pass: "read-only" -> "write" :)

Ah, I didn't think about that, but now that you mention it I see what you mean...

kripken avatar Mar 21 '22 21:03 kripken

It would be cool to have a global analysis to provide detailed information on where every read (of any kind) flows to. Then the global type optimization and various other optimizations could be built on top of that. In principle, improvements to the analysis (or compromises to improve performance) would benefit all of the optimizations built on top of it.

tlively avatar Mar 21 '22 21:03 tlively

@tlively Agreed. For now it's been simpler to write specific flow analyses in separate passes to see how far that gets us, but "one big analysis to rule them all" might help more.

It is a little hard to estimate how much better we could get with such an approach. One way to estimate it is to check how much it would help type refining. I think the precise type info idea would help there since once we have that in the type system all the existing passes will be "upgraded" as if we had a single big analysis, at least minus "cycles". For "cycles", we have that other idea related to precise type info, I forgot what we called it, but it propagated even partial info. Those things would get us "one big analysis" for type propagation, and I'd say if we get big wins there then it could justify "one big analysis" for other things like GC fields etc.

kripken avatar Mar 21 '22 21:03 kripken