Tracking Issue: Vaporization Checklist
Vaporization is not yet implemented, as there are other features that need to be in place before it makes sense to add it in. Here's how we'd need to do this, from #52:
To do so, we'll need to add a new pass to the compiler, potentially named
last_use.rsor something, that walks the AST backwards and records the first occurrence (i.e. last use) of each variable. The second thing we need to do is add support for Rust'sCowpointers (clone on write) intagged.rs. After that, we'll need to modify the generation step (gen.rs) to take last use into account, and emit instructions for moving out of sharedCowreferences. Finally, the VM will need to add minor runtime support around instructions for copying shared references. I'll open an issue with a tracking list so we can keep track of work wrt that.
Here's that issue! And, as promised, the checklist:
- [ ] Remove mutation in closures, enforce only mutable in local scope rule.
- [ ] Add
last_usepass to compiler. - [ ] Add support for
CoWintagged. - [ ] Modify
gento work off oflast_usepass. - [x] Add instructions that manage
CoWpointers. - [x] Add support for instructions in
vm - [ ] Tests and so on - fine-grained extension to more complex data types, like slices of lists.
For those interested, there's a forever-WIP post on vaporization on my blog: https://slightknack.dev/blog/vaporization/.
In my news channel I just saw a reference to a paper with a simpler SSA algorithm than the widely-known one with just about the same performance characteristic: https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6 . Feel free to take inspiration (I myself don't have the time now to read it though :cry:).
That paper is really well written! I haven't read it before, but this part really struck a chord:
In the following, we describe our algorithm to construct SSA form. It significantly differs from Cytron et al.’s algorithm in its basic idea. Cytron et al.’s algorithm is an eager approach operating in forwards direction: First, the algorithm collects all definitions of a variable. Then, it calculates the placement of corresponding φ functions and, finally, pushes these definitions down to the uses of the variable. In contrast, our algorithm works backwards in a lazy fashion: Only when a variable is used, we query its reaching definition. If it is unknown at the current location, we will search backwards through the program. We insert φ functions at join points in the CFG along the way, until we find the desired definition. We employ memoization to avoid repeated look-ups.
Called it, knew backwards was the way to go! (Emphasis in the quote mine.)
Jokes aside, I'll have to take a closer look at the algo presented in it. Vaporization doesn't require full SSA, just last reuse analysis, but nonetheless it'd be a good idea to design the compiler with the possibility of SSA in mind. Thanks for pointing it my way!