compiler_rt: update memcpy to compare usizes at a time

Open nektro opened this issue 3 years ago • 2 comments

cc @mikdusan, who noticed that this makes stage2's compilation of stage3 slightly more than 2x faster on macOS when using our own implementation

libSystem.memcpy: 299.05 seconds
memcpy.zig: 996.01 seconds
memcpy_usize.zig: 443.22 seconds

Oct 22 '22 18:10 nektro

further notes:

memmove can be modified to call this function
memcpy can be modified to use this same optimization

Oct 22 '22 18:10 nektro
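
For readers unfamiliar with the idea, here is a minimal sketch of copying a machine word at a time, written in C for illustration only. It is not the compiler_rt code from this issue. The names `copy_words`, `word_t`, and `copy_words_selfcheck` are hypothetical, and the `may_alias` attribute assumes GCC or Clang. The word loop is used only when source and destination share the same misalignment; otherwise it falls back to a byte loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Word type that may alias any buffer (GCC/Clang extension, an assumption). */
typedef uintptr_t word_t __attribute__((may_alias));

/* Hypothetical helper, not the compiler_rt symbol. Copies n bytes from src
 * to dst; like memcpy, the two regions must not overlap. */
void *copy_words(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t W = sizeof(word_t);

    /* The word loop needs both pointers aligned at the same time, which is
     * only possible when they share the same offset within a word. */
    if ((uintptr_t)d % W == (uintptr_t)s % W) {
        /* Head: copy single bytes until the pointers are word-aligned. */
        while (n > 0 && (uintptr_t)d % W != 0) {
            *d++ = *s++;
            n--;
        }
        /* Body: copy one machine word per iteration. */
        word_t *dw = (word_t *)d;
        const word_t *sw = (const word_t *)s;
        while (n >= W) {
            *dw++ = *sw++;
            n -= W;
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }
    /* Tail, or the whole copy when the misalignments differ. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}

/* Exercises several offsets and lengths; returns 1 if every copy matched. */
int copy_words_selfcheck(void) {
    unsigned char src[64], dst[64];
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);
    for (size_t off = 0; off < 9; off++) {
        for (size_t len = 0; off + len <= 40; len++) {
            memset(dst, 0xAA, sizeof dst);
            copy_words(dst + off, src + off, len);
            assert(memcmp(dst + off, src + off, len) == 0);
            assert(dst[off + len] == 0xAA); /* nothing written past the end */
        }
    }
    return 1;
}
```

A production version would also handle the mismatched-alignment case more cleverly than a byte loop; this sketch keeps that path simple on purpose.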

Couldn't this use SIMD to go in chunks of 16/32 and copy that way? Should be straightforward with @Vector

Oct 22 '22 19:10 Jarred-Sumner
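
As a rough illustration of the chunked approach suggested above, here is a sketch in C using GCC/Clang vector extensions rather than Zig's `@Vector`. The names `vec16` and `copy_vec16` are hypothetical, the 16-byte chunk size is just one of the two sizes mentioned, and nothing here is taken from the actual compiler_rt implementation or measured for speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 16-byte vector of bytes. aligned(1) allows unaligned loads and stores and
 * may_alias lets it overlay any buffer. GCC/Clang extension (an assumption). */
typedef unsigned char vec16
    __attribute__((vector_size(16), aligned(1), may_alias));

/* Hypothetical helper: copies n bytes in 16-byte chunks, then finishes the
 * remainder byte by byte. Regions must not overlap, as with memcpy. */
void *copy_vec16(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Main loop: one 16-byte vector load and store per iteration. */
    while (n >= sizeof(vec16)) {
        *(vec16 *)d = *(const vec16 *)s;
        d += sizeof(vec16);
        s += sizeof(vec16);
        n -= sizeof(vec16);
    }
    /* Tail: fewer than 16 bytes remain. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}

/* Checks a range of lengths and offsets; returns 1 if every copy matched. */
int copy_vec16_selfcheck(void) {
    unsigned char src[96], dst[96];
    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 3 + 5);
    for (size_t off = 0; off < 4; off++) {
        for (size_t len = 0; off + len <= 70; len++) {
            memset(dst, 0x55, sizeof dst);
            copy_vec16(dst + off, src + off + 1, len);
            assert(memcmp(dst + off, src + off + 1, len) == 0);
            assert(dst[off + len] == 0x55); /* nothing written past the end */
        }
    }
    return 1;
}
```

Whether this beats a word-sized loop depends on the target and on how the compiler lowers the vector accesses, so it would need benchmarking before drawing conclusions.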

Uhm, both the title and the code comment mention "comparing" usizes - that should read "copying", right? Did the first comment also mean to mention memcmp instead of memcpy? Although 4 people have already looked at this, so maybe I'm bugging.

Oct 23 '22 20:10 rohlem

No, @rohlem, you're right.

Oct 23 '22 21:10 nektro