
Performance difference between cuSZ and SZ

Open · mawpolaris opened this issue 4 years ago · 3 comments

Hi Team,

Have you observed any significant differences in compression ratio between SZ and cuSZ (same data, error mode, and error bound)?

— mawpolaris, May 12 '21 17:05

Hi,

cuSZ applies Huffman coding (customized, over multibyte symbols; CPU-SZ uses the same) after prediction-quantization, while CPU-SZ additionally applies DEFLATE (gzip) after Huffman coding. By exploiting repeated patterns, CPU-SZ achieves a higher compression ratio (CR). For example, on some smooth/sparse data fields CR can go beyond 32x (with float32 input), whereas Huffman encoding without pattern finding can achieve at most 32x, the limiting case of 1-bit codewords. (Besides, blocking for the purpose of parallelization creates a slight difference.)
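To make that ceiling concrete, here is a small illustrative sketch (not cuSZ code; the repetitive byte stream is a toy stand-in for near-constant quantization codes): Huffman alone on float32 input caps at CR = 32/b for b-bit average codewords, while a dictionary coder such as DEFLATE can go far beyond that on repetitive data.

```python
import zlib

# Huffman-only CR ceiling for float32 input: CR = 32 / avg codeword bits.
# Best case (every symbol identical -> 1-bit codewords) caps CR at 32x.
def huffman_cr_ceiling(avg_codeword_bits, input_bits=32):
    return input_bits / avg_codeword_bits

print(huffman_cr_ceiling(1.0))   # 32.0, the Huffman-only cap

# DEFLATE (what gzip uses) additionally finds repeated patterns, so on a
# highly repetitive stream it blows past that cap.
codes = bytes([7]) * 100_000     # toy stand-in for near-constant quant codes
cr = len(codes) / len(zlib.compress(codes))
print(cr > 32)                   # True: pattern finding beats the Huffman cap
```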

If you need to observe the highest possible compression ratio (in practice we trade this off against high kernel throughput), here are several approaches that may help:

  • enable gzip and/or nvcomp (pre-alpha, may fail)

```
cusz -t f32 -m r2r -e 1e-4 -i ./data/ex-cesm-CLDHGH -2 3600 1800 -z --gzip
cusz -t f32 -m r2r -e 1e-4 -i ./data/ex-cesm-CLDHGH -2 3600 1800 -z --nvcomp
cusz -t f32 -m r2r -e 1e-4 -i ./data/ex-cesm-CLDHGH -2 3600 1800 -z --gzip --nvcomp
```

  • simply `tar zcf <cusz-archive>`
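The tar route can also be scripted; here is a minimal sketch using Python's `tarfile`, equivalent to `tar zcf` on the shell (file names here are placeholders, not actual cuSZ output names):

```python
import tarfile, tempfile, os

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "example.cusza")       # placeholder archive name
    with open(src, "wb") as f:
        f.write(b"\x00" * 4096)                  # stand-in payload
    out = os.path.join(d, "example.tgz")
    with tarfile.open(out, "w:gz") as tf:        # same as `tar zcf`
        tf.add(src, arcname=os.path.basename(src))
    packed, raw = os.path.getsize(out), os.path.getsize(src)

print(packed < raw)  # repetitive payload shrinks under gzip: True
```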

Hope this answers your question.


— jtian0, May 12 '21 17:05

Hi Nate,

Thank you for your response. That makes sense.

To clarify, do SZ (v2.1+) and cuSZ (current version; which version number?)

  1. share the same predictors and/or compression-quality optimizer?
  2. share the same linear-scale quantization?
  3. use different Huffman encoding approaches and dictionary encoding?

— mawpolaris, Jun 01 '21 21:06

Hi @mawpolaris,

For the time being, we can say cuSZ release 0.2.2 onward (the later updates only enhance performance). In general, SZ 2.1 is far more mature than cuSZ in (1) having preprocessing, more compression modes (e.g., point-wise), and autotuning, and (2) having both the Lorenzo predictor and linear regression, whereas cuSZ has only Lorenzo (we are working on new predictors).

  1. They share the same Lorenzo predictor. However, many factors act as the quality optimizer and affect data quality:
  • preprocessing such as log transform and point-wise transform;
  • autotuning eb with a target PSNR as the goal;
  • the initial values from which border values are predicted (as if padding). cuSZ predicts from zeros, while SZ determines optimal values for, e.g., application-specific metrics. Please also note that the naive setting of zeros can result in a significantly higher PSNR than SZ (with the same eb), as pointed out in Table 8 on page 10 of this doc, but that is not necessarily better for applications: it is data dependent.
  • PSNR as a generic metric can be used this way: SZ guarantees a lower bound on PSNR when the eb is relative to the data range, e.g., 64 dB for 1e-3 and 84 dB for 1e-4.
  2. The linear scaling can be the same. SZ has an extra optimizer to decide the linear-scaling range $[-r, +r]$; out-of-range quantization values are treated as outliers. This optimizes compression ratio.
  3. Currently, the Huffman encoding is the same, except that cuSZ partitions the data (hence overhead from padding bits and partitioning metadata).

I will also try to add an FAQ entry addressing these questions.
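As a cross-check on the PSNR numbers quoted above: under the assumption that the quantization error is uniformly distributed in [-eb·range, +eb·range], MSE = (eb·range)²/3, so the lower bound is PSNR = -20·log10(eb) + 10·log10(3), which reproduces the quoted 64 dB for eb = 1e-3 and 84 dB for 1e-4.

```python
import math

# Lower-bound PSNR for a range-relative error bound eb, assuming
# quantization error uniform in [-eb*range, +eb*range]:
#   MSE  = (eb*range)^2 / 3
#   PSNR = 10*log10(range^2 / MSE) = -20*log10(eb) + 10*log10(3)
def psnr_lower_bound(eb):
    return -20 * math.log10(eb) + 10 * math.log10(3)

print(round(psnr_lower_bound(1e-3), 1))  # 64.8 dB (quoted as "64")
print(round(psnr_lower_bound(1e-4), 1))  # 84.8 dB (quoted as "84")
```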

— jtian0, Jun 06 '21 00:06