lineiform Use a different IR than Cranelift

move |a: usize| {
    let val: usize;
    if Wrapping(a) + Wrapping(0x5) < Wrapping(a) {
        val = 1;
    } else {
       val = 2;
    }
    val
}

---- cmp rsi, -0x5
---- mov eax, 0x2
---- adc rax, -0x1
---- ret

function u0:0(i64, i64, i64) -> i64 system_v {
block0(v0: i64, v1: i64, v2: i64):
    jump block1

block1:
    v3 = iconst.i64 -5
    v4, v5 = isub_ifbout.i64 v2, v3
    v6 = iconst.i64 2
    v7 = iconst.i64 -1
    v8, v9 = iadd_ifcarry v6, v7, v5
    return v8
}

(Ignore the weird block0) This errors when compiling with thread 'test::test_passthrough_usize' panicked at 'ALU+imm and ALU+carry ops should not appear here!', /home/charlie/.cargo/registry/src/github.com-1ecc6299db9ec823/cranelift-codegen-0.75.0/src/isa/x64/lower.rs:5977:13

Technically for this specific case I could turn the cmp into a Cranelift icmp_imm instruction, but later could could be depending on the rsi (v4) value. I'd have to re-emit an isub instruction right before its use, which I can do in this specific case, but really this is just a sign that Cranelift's IFLAGS mechanism plain doesn't work for me. The bigger issue is that Cranelift assumes that all math ops clobber all IFLAGS, so you can't have, for example

---- cmp rsi, -0x5
---- lea eax, [rsi+5]
---- mov eax, 0x2
---- adc rax, -0x1
---- ret

since it would say that the flag produced by the cmp would be invalid after the iadd from the LEA. I'd have to do my own code movement and instruction picking to make it happy, and even then Cranelift flag rules are hard enough to use it's probably impossible.

I'm not entirely sure where to go from here: I probably should've just been using LLVM this entire time, since that's what a lot of other x86_64 lifters target for a backend. Alternatively, I could try to use dynasm-rs: it doesn't have any optimizations, but then Cranelift barely does either (and it's codegen is often worse than just re-emitting the same instructions!). By lifting and lowering to the exact same instructions, nothing would clobber flags - at worse I'd have to emit stc; adc if I constant fold the add part of an add; adc, and could probably do something smarter instead.

Sep 20 '21 01:09 chc4

This has it's own set of issues - the most important being that I have to do my own register allocation now instead of being able to steal Cranelift's (which was the main motivation of using it!). dynasm-rs does support dynamic registers for operands, but you need to give it numbers for which one to use. This probably means I need my own IR as well, since register allocation is usually done in a bottom up manner to recover live ranges, instead of one-pass top down (you can technically do a greedy register allocation, where everything is in a register by default and spilled in LRU manner when that's full, but it will have horrible performance characteristics on long functions or any loops...). We also have the problem where we can't spill values to the stack naively, since the assembly will be using that too and require contiguous ranges for stack buffers! So we need to know the full spill list when we start code-gen, so we can pass the same stack buffers to functions the same way.

Some people in the langdev discord channel have been talking about RVSDG, which is a hybrid CFG/DFG representation. We care about the DFG side more than the CFG side (since we already have a function layout graph from what we lift, we'd just like to optimize to a better one if it's possible), so it might be a good fit.

Sep 20 '21 01:09 chc4

Started development on the tangle branch.

Sep 26 '21 18:09 chc4

I merged the tangle branch into mainline; it isn't isn't nearly good enough to switch to yet, but it does implement a register allocator (without spilling support) and Technically Works(tm).

Feb 17 '22 03:02 chc4