zlib Add optimization for Adler32 checksum for Power processors

Hi,

This PR introduces a optimization for Adler32 checksum for POWER8+ processors that uses VSX (vector) instructions.

If adler32 do 1 byte at time on the first iteration s1 is s1_0 (_n means iteration n) is the initial value of adler, at beginning _0 is 1 unless adler initial value is different than 1. So s1_1 = s1_0 + c[0] after the first calculation. For the next iteration s1_2 = s1_1 + c[1] and so on. Hence, for iteration N, s1_N = s1_(N-1) + c[N] is the value of s1 on after iteration N. Therefore, for s2, s2_N = s2_0 + Ns1_N + Nc[0] + N-1*c[1] + ... + c[N] In a more general way:

s1_N = s1_0 + sum(i=1 to N)*c[i]

s2_N = s2_0 + N*s1 + sum (i=1 to N)(N-i+1)*c[i]

Where s1_N, s2_N are the values for s1, s2 after N iterations. So if we can process N-byte at time we can obtain adler32 checksum for N-byte at once. Since VSX can support 16-byte vector instructions, we can process 16-byte at time using N = 16 we have:

s1 = s1_16 = s1_0 + sum(i=1 to 16)c[i]

s2 = s2_16 = s2_0 + 16*s1 + sum(i=1 to 16)(16-i+1)*c[i]

The VSX version starts to improve the performance for buffers with size >= 64. The performance is up to 10x better than Adler32 version from adler32 non-vectorized version (average cpu time in ns on 100000 iterations):

buffer size	adler32 baseline	adler32 power	speedup
64	44.921875	41.015625	-
1024	943.359375	130.859375	7.2
10*5552	42519.531250	3974.609375	10.7

For buffer with length <= than 64 the performance is almost the same of the non-vectorized implementation (with a small performance degradation in some cases):

buffer size	adler32 baseline	adler32 power
NULL	5.859375	6.812500
1	3.906250	4.859375
15	11.718750	12.625000
48	35.156250	33.203125

Dec 10 '19 19:12 racardoso

FYI this PR uses the same base commit as #457 to add base code for Power optimizations. When either one gets accepted, the other can be rebased to remove the first commit from the PR.

Dec 10 '19 20:12 mscastanho

A long time ago, I have done this ticket:

https://github.com/madler/zlib/issues/847

May 23 '25 00:05 Neustradamus