
Suboptimal codegen for snippet with Armv7 target

Open vimirage opened this issue 3 years ago • 0 comments

The code generated for this function seems quite suboptimal:

pub const fn f(n: u8) -> [u8; 4] {
    match n % 4 {
        0 => [0x0, 0x1, 0x2, 0x3],
        1 => [0x4, 0x5, 0x6, 0x7],
        2 => [0x8, 0x9, 0xA, 0xB],
        3 => [0xC, 0xD, 0xE, 0xF],
        _ => unsafe { std::hint::unreachable_unchecked() }
    }
}

From my observations, when the function is written as above, every target emits a switch table and loads the result from memory.

For x86-64, if the inner arrays are moved into constants, the switch table is removed, and the code is replaced with arithmetic.
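For readers who do not want to open the links, here is a minimal sketch of what "moving the inner arrays into constants" could look like. The names `A0` to `A3` and `f_consts` are made up for illustration and are not taken from the linked snippets, which may be written differently.

```rust
// Hypothetical constant names; the issue does not say what they are called.
const A0: [u8; 4] = [0x0, 0x1, 0x2, 0x3];
const A1: [u8; 4] = [0x4, 0x5, 0x6, 0x7];
const A2: [u8; 4] = [0x8, 0x9, 0xA, 0xB];
const A3: [u8; 4] = [0xC, 0xD, 0xE, 0xF];

// Same behavior as `f` in the issue, with each match arm
// returning a named constant instead of an inline array literal.
pub const fn f_consts(n: u8) -> [u8; 4] {
    match n % 4 {
        0 => A0,
        1 => A1,
        2 => A2,
        3 => A3,
        // Sound: `n % 4` is always in 0..=3 for a u8.
        _ => unsafe { std::hint::unreachable_unchecked() },
    }
}
```

Whether this form actually changes the generated code depends on the compiler version and target, so it is worth checking the output rather than assuming it.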

Side-by-side comparisons between x86-64 codegen versus armv7-linux-androideabi: https://godbolt.org/z/ehxabaq38

Here, I was able to manually rewrite the expression into the equivalent of what LLVM emits above: https://godbolt.org/z/qhfaqEcsf
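As an illustration of what a manual arithmetic rewrite can look like, here is one possible formulation. It is an assumption on my part and not necessarily the expression used in the link above. The name `f_arith` is hypothetical. The idea is that row `k` is `[4k, 4k+1, 4k+2, 4k+3]`, which, read as a little-endian `u32`, equals `0x03020100 + k * 0x04040404`.

```rust
// Hypothetical name; a sketch of a branch-free, table-free equivalent of `f`.
pub const fn f_arith(n: u8) -> [u8; 4] {
    // k is in 0..=3, so each byte of the product is at most 0x0C and
    // each byte of the sum is at most 0x0F: no carries cross byte lanes.
    let k = (n % 4) as u32;
    let packed = 0x0302_0100u32 + k * 0x0404_0404u32;
    // Little-endian byte order puts the lowest byte first,
    // which matches the element order of the original arrays.
    packed.to_le_bytes()
}
```

For example, `f_arith(1)` computes `0x03020100 + 0x04040404 = 0x07060504`, whose little-endian bytes are `[0x4, 0x5, 0x6, 0x7]`, matching the second arm of the original `match`.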

Nothing else I tried made the compiler emit this codegen.

I don't know whether this also applies to other targets.

@rustbot label A-LLVM I-slow

vimirage, Jun 16 '22 04:06