stage2 performance regression regarding struct and packed struct vectorization

Open eLeCtrOssSnake opened this issue 3 years ago • 1 comments

Zig Version

0.10.0-dev.4560+828735ac0

Steps to Reproduce and Observed Behavior

https://godbolt.org/z/6ccjvdK6e This benchmark shows a performance degradation between stage1 and stage2. Remove the -fstage1 compiler argument to see the stage2 results. The disassembly shows worse vectorization of struct access on stage2, especially for the packed struct.

Expected Behavior

No performance regression.

eLeCtrOssSnake, Oct 31 '22 17:10

Performance for this example can be recovered by capturing the element by pointer with for(dataset_packed) |*v, k|, but you have to be careful to insert a copy in exactly the right place:

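for(dataset_packed) |*v, k| {
    // v points at dataset_packed[k]; a is read before it is written.
    dataset_packed[k].a +%= v.a;
    // Snapshot the element before b, c, and d are overwritten.
    const v_copy = v.*;
    dataset_packed[k].b = v_copy.c and v_copy.d;
    dataset_packed[k].c = v_copy.b and v_copy.d;
    dataset_packed[k].d = v_copy.b and v_copy.c;
}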

If v_copy is moved up a line and used for the entire loop body, performance is still bad.

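If v.* is not copied at all, then this does not compute the same result as the original code: v points at dataset_packed[k], so the later assignments would read fields that earlier assignments have already overwritten.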
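
To make the aliasing problem concrete, here is a minimal sketch. It assumes a hypothetical plain struct S with three bool fields, since the real benchmark types are only available through the godbolt link, and it uses two illustrative helper functions (updateAliased and updateFromCopy) that are not part of the original code.

```zig
const std = @import("std");

// Hypothetical stand-in for the benchmark's element type.
const S = struct { b: bool, c: bool, d: bool };

// Reads through the pointer after writing through it, so later
// assignments see fields that were already overwritten.
fn updateAliased(p: *S) void {
    p.b = p.c and p.d;
    p.c = p.b and p.d; // p.b is already the new value here
    p.d = p.b and p.c; // p.b and p.c are both new values here
}

// Takes a snapshot first, so every assignment reads the old fields.
fn updateFromCopy(p: *S) void {
    const old = p.*;
    p.b = old.c and old.d;
    p.c = old.b and old.d;
    p.d = old.b and old.c;
}

test "skipping the copy changes the result" {
    var x = S{ .b = true, .c = false, .d = true };
    var y = x;
    updateAliased(&x);
    updateFromCopy(&y);
    // Only the snapshot version computes c from the original b.
    try std.testing.expect(!x.c);
    try std.testing.expect(y.c);
}
```

Starting from b = true, c = false, d = true, the aliased version ends with c = false because it reads the freshly written b, while the snapshot version ends with c = true. This only shows the semantic difference; it says nothing about which variant vectorizes better.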

topolarity, Nov 01 '22 15:11