regalloc2 Rematerialization

As an alternative to spilling and reloading, a vreg value can sometimes be recomputed from other available values. There are two ways this can be supported in regalloc2:

Simple rematerialization

Supporting rematerialization for constants, including runtime constants which only depend on pinned registers (e.g. VMContext), is relatively straightforward. The client needs to track rematerialization metadata for each VReg, after which we can elide spills for this vreg (unless there is a stack use) and replace reloads with rematerializations.

trait Function {
    fn can_rematerialize(&self, vreg: VReg) -> bool;
}

enum Edit {
    /// The code sequence for rematerialization is only allowed to
    /// use `dest` and the dedicated scratch register.
    Rematerialize {
        vreg: VReg,
        // This is not an Allocation because rematerializing
        // into a stack slot doesn't make sense.
        dest: PReg,
    }
}

Complex rematerialization

Rematerialization involving values in other vregs is more complex and probably not worth the effort of implementing. It has been done before as part of a research project on V8 but AFAIK this has not made it into the main V8 codebase.

Sep 01 '21 22:09 Amanieu

I'm currently looking to implement the simple rematerialization scheme I described above. Here's a summary of the changes I think are needed:

In compute_liveness, don't add uses to a liverange for block parameters on branches. These aren't real operands so it's fine if the allocator assigns Allocation::none to them. This seems to already be the case for moves. This is needed when a vreg has no current location but will be rematerialized later.
In spill_weight_from_constraint, assign a spill weight of 0 to defs of rematerializable vregs and half of the usual value for uses.
LiveBundle needs a "remat" flag if it is associated with a rematerializable vreg. This flag is cleared if the bundle is merged with another vreg or if the bundle has any stack uses.
In process_bundle, treat Requirement::Any as Requirement::Register for "remat" bundles. We prefer allocating Any constraints to registers when a stack slot is not required.
In try_allocating_regs_for_spilled_bundles, don't mark a spillset as required for spilled bundles that have the "remat" flag.
In get_alloc_for_range, return an AllocationKind::None that is tagged with the source vreg for spillsets that have no associated spillslot. This is only possible if the range refers to a rematerializable vreg and the range has no uses. Rematerializable vregs that have a spillslot (due to stack uses or merging with another vreg) still use AllocationKind::Stack.
Redundant move elimination should be able to remove redundant rematerializations since it can recognize the tagged AllocationKind::None as different but this needs to be double-checked.
In resolve_inserted_moves, eliminate moves whose destination is AllocationKind::None.
In resolve_inserted_moves, convert moves whose source is AllocationKind::None or AllocationKind::Stack to Edit::Rematerialize if they come from a rematerializable vreg. This requires changing moves to track from_vreg instead of to_vreg which will require an API change. Cranelift only uses this information to get the type of a vreg so this shouldn't be a problem.

@cfallin Does this seem reasonable to you?

Sep 05 '21 10:09 Amanieu

@Amanieu: yes, overall, this looks quite reasonable! Thank you very much for working on this.

A few thoughts:

I like the clean zero-spill-cost approach. One could imagine an alternative approach where we edit the liveness info in some way (delete liveranges?) to represent the unneeded dataflow, equivalent to a new def at the remat point. That has some appeal but significantly changes the macro-level ordering of operations; better to keep the bundles but make them phantoms that live nowhere and remat a value when needed.
The "clear remat flag on bundle merge" may be overly pessimizing, depending on the reason for the merge. If we're merging across a move, it should be fine to remat the destination of the move -- in this case we want to propagate any remat metadata instead ("rematerialize v25 according to the remat instructions for v12" or somesuch in the Edit). Moves may not be as relevant for SSA input (e.g. I don't think your use-case requires them at all) so maybe not as immediately important for you.

I agree re: above that "complex remat" (repeating insts with other vreg inputs) is a much deeper project and probably not necessary for a large chunk of the benefit (namely, constant values and maybe global/argument/other environment-context loads). It might be worth looking into what other compilers do here at some point, out of curiosity if nothing else -- I don't know much about the state of the art in remat.

Anyway, happy to talk more as you look at this!

Sep 07 '21 04:09 cfallin

The "clear remat flag on bundle merge" may be overly pessimizing, depending on the reason for the merge. If we're merging across a move, it should be fine to remat the destination of the move -- in this case we want to propagate any remat metadata instead ("rematerialize v25 according to the remat instructions for v12" or somesuch in the Edit). Moves may not be as relevant for SSA input (e.g. I don't think your use-case requires them at all) so maybe not as immediately important for you.

The problem is that any other vreg we merge with will almost certainly not be rematerializable on its own, usually because it has multiple possible values (e.g. a block parameter, but this also applies to moves). The rematerialization only applies to one of these values but at that point we don't know which value that vreg will take.

Sep 07 '21 08:09 Amanieu