Rewrite Huffman Coding Implementation
These commits

- refactor the original Huffman coding implementation;
- add regression tests for `huffman_encoding`, `huffman_decoding`, `compress_memory_huffman`, and `decompress_memory_huffman`;
- remove the timing statements for `compress_memory_huffman`;
- add new Huffman coding functions, `huffman_encode` and `huffman_decode`;
- add a new Huffman code serialization method, `RFMH`, and new lossless compression methods, `CPU_HUFFMAN_ZLIB` and `CPU_ZSTD` (renaming the original `CPU_HUFFMAN_ZLIB` to `CPU_ZLIB`);
- fix an asymmetry between zlib and ZSTD compression (`CPU_ZLIB`, formerly `CPU_HUFFMAN_ZLIB`, didn't use Huffman coding, while `CPU_HUFFMAN_ZSTD` did);
- reimplement `compress` and `decompress`, adding support for the new lossless compression methods and the new Huffman code serialization method;
- rename `include/compressors.hpp` and the zlib and ZSTD compression functions;
- split the frequency and 'missed' buffers into sequences of subbuffers of limited size so that Protobuf doesn't complain (I was getting errors linked to this; see the sketch after this list);
- bump the file format version number to 1.1.0 and the MGARD version number to 1.3.0; and
- add a lot of tests.
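The subbuffer splitting mentioned above is conceptually just chunking the serialized bytes. Here is a minimal sketch, assuming each chunk ends up in its own repeated Protobuf `bytes` entry; the helper name and the chunking interface are illustrative, not the exact code in these commits:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Split `buffer` into chunks of at most `max_size` bytes so that no single
// Protobuf `bytes` field has to hold the entire (possibly huge) buffer. Each
// returned string would be appended to a repeated `bytes` field.
std::vector<std::string> split_into_subbuffers(const unsigned char *buffer,
                                               std::size_t size,
                                               std::size_t max_size) {
  std::vector<std::string> subbuffers;
  for (std::size_t offset = 0; offset < size; offset += max_size) {
    const std::size_t n = std::min(max_size, size - offset);
    subbuffers.emplace_back(reinterpret_cast<const char *>(buffer) + offset, n);
  }
  return subbuffers;
}
```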
@qliu21, please don't merge this pull request yet. I'll rebase once you merge #194 and #195.
@JieyangChen7, I am going to need a little bit of help getting this pull request ready to merge. I just rebased on #197. When I try to build, I get the following error message.
```
$ cmake -S . -B build -D CMAKE_PREFIX_PATH="$HOME/.local" -D MGARD_ENABLE_SERIAL=ON
$ cmake --build build --parallel 8
…
lib/libmgard.so.1.3.0: undefined reference to `mgard::huffman_decoding(long*, unsigned long, unsigned char*, unsigned long, unsigned char*, unsigned long, unsigned char*, unsigned long)'
lib/libmgard.so.1.3.0: undefined reference to `mgard::huffman_encoding(long*, unsigned long, unsigned char**, unsigned long*, unsigned char**, unsigned long*, unsigned char**, unsigned long*)'
collect2: error: ld returned 1 exit status
CMakeFiles/mgard-x-autotuner.dir/build.make:107: recipe for target 'bin/mgard-x-autotuner' failed
```
(This is on 6e281a6646a19e87cdd4382c773cda3d8b14edbf.) Here's the issue, as far as I can tell.
- In `include/mgard-x/CompressionLowLevel/CompressionLowLevel.hpp`, `compress` calls `CPUCompress` and `decompress` calls `CPUDecompress`.
- In `include/mgard-x/Lossless/CPU.hpp`, `CPUCompress` calls `mgard_x::compress_memory_huffman` and `CPUDecompress` calls `mgard_x::decompress_memory_huffman`.
- `mgard_x::compress_memory_huffman` calls `mgard::huffman_encoding` and `mgard_x::decompress_memory_huffman` calls `mgard::huffman_decoding`.
- I've removed those functions. It's possible to achieve the same effect with `compress` and `decompress` (declared in `include/lossless.hpp`), but that functionality is deprecated because of issues with the original Huffman coding implementation (see #190). A sketch of the kind of change I have in mind follows this list.
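For concreteness, here is a rough sketch of the kind of change needed in `include/mgard-x/Lossless/CPU.hpp`. The `MemoryBuffer` struct and `compress_lossless` function below are simplified stand-ins for `mgard::MemoryBuffer` and the new `compress`, and the `CPUCompress` signature is an assumption rather than the actual MGARD-X interface:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Stand-in for mgard::MemoryBuffer<unsigned char>: an owned byte buffer.
struct MemoryBuffer {
  std::unique_ptr<unsigned char[]> data;
  std::size_t size;
};

// Stand-in for the new lossless entry point (the real one would Huffman-code
// the quantized data and then apply zlib/ZSTD). Here it just copies the bytes
// so the sketch is self-contained.
MemoryBuffer compress_lossless(const std::int64_t *quantized, std::size_t n) {
  const std::size_t nbytes = n * sizeof(std::int64_t);
  MemoryBuffer out{std::make_unique<unsigned char[]>(nbytes), nbytes};
  std::memcpy(out.data.get(), quantized, nbytes);
  return out;
}

// Hypothetical CPUCompress body: forward to the new API and hand back the
// owned bytes, instead of calling the removed mgard::huffman_encoding.
std::vector<unsigned char> CPUCompress(const std::int64_t *quantized, std::size_t n) {
  const MemoryBuffer out = compress_lossless(quantized, n);
  return {out.data.get(), out.data.get() + out.size};
}
```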
I would just modify `CPUCompress` and `CPUDecompress` to use the new Huffman coding implementation myself, but any code using the new implementation will need to interact with some new Protobuf fields I've introduced, and I don't know how to do that with your code. That's why I'm roping you in. Do you have any time before the 19th (the next MGARD software engineering meeting) to meet and figure out what changes are needed?
@ben-e-whitney Ok, let me take a look at this commit first and figure out the possible solutions. I will let you know.
@ben-e-whitney I managed to fix this issue by calling the new compress/decompress API in `include/mgard-x/Lossless/CPU.hpp`. No major changes are necessary on your side.
There is one minor change needed. Since the new compress API returns `mgard::MemoryBuffer`, it requires me to (indirectly) include `"utilities.hpp"`, which triggers the NVCC bug. So, I added `#ifndef __NVCC__` around the member functions related to `mgard::CartesianProduct` to make it work. Do you think this is fine?
Those changes are on my local machine. If you are OK with them, I can either push them directly to this huffman-decoding branch, or let you know what changed and you can make the commit.
@ben-e-whitney I noticed that the new lossless compress/decompress implementation has a lower throughput compared with the old implementation. Here are my results with the same data (134 million quantized elements) and settings (Zstd with level=1). I timed just the lossless compress/decompress functions. Is this performance expected? I want to make sure I'm using the functions in the right way.

|     | Compression                             | Decompression                           | Compression ratio |
| --- | --------------------------------------- | --------------------------------------- | ----------------- |
| Old | 0.51 s (Huffman: 0.47 s + Zstd: 0.03 s) | 0.54 s (Huffman: 0.52 s + Zstd: 0.01 s) | 208×              |
| New | 2.11 s (Huffman: 2.07 s + Zstd: 0.03 s) | 3.13 s (Huffman: 3.10 s + Zstd: 0.01 s) | 219×              |
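For reference, I timed the stages roughly like this (simplified sketch; `run_lossless` is just a placeholder for the compress or decompress call being measured, not an MGARD function):

```cpp
#include <chrono>
#include <cstdio>
#include <functional>

// Time a single lossless stage (Huffman or Zstd, compression or decompression).
double time_seconds(const std::function<void()> &run_lossless) {
  const auto start = std::chrono::steady_clock::now();
  run_lossless();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}

int main() {
  // The lambda body would call the actual compressor on the quantized data.
  const double t = time_seconds([] { /* lossless compress/decompress here */ });
  std::printf("lossless stage: %.2f s\n", t);
}
```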
@JieyangChen7 Thanks for all the help.
> There is one minor change needed. Since the new compress API returns `mgard::MemoryBuffer`, it requires me to (indirectly) include `"utilities.hpp"`, which triggers the NVCC bug. So, I added `#ifndef __NVCC__` around the member functions related to `mgard::CartesianProduct` to make it work. Do you think this is fine?
Yes, I think that's fine. I'd even put all of `CartesianProduct` and `CartesianProduct::iterator` in the `#ifndef __NVCC__`–`#endif` block. I think that might make any future compiler errors easier to understand.
Sorry for all the trouble `CartesianProduct`/the NVCC bug is causing. If this keeps happening, I can split `utilities.hpp` into `utilities/MemoryBuffer.hpp`, `utilities/CartesianProduct.hpp`, etc. Then we could prevent NVCC from ever encountering `CartesianProduct`.
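Concretely, I'm picturing something like this in `utilities.hpp` (the template parameters and member list below are placeholders, not the real declaration; the point is just that the whole class, iterator included, is invisible to NVCC):

```cpp
// Placeholder declaration only: the real CartesianProduct in utilities.hpp
// has different template parameters and members. What matters is that the
// entire class, including its iterator, sits inside the guard.
#include <cstddef>

#ifndef __NVCC__
template <typename T, std::size_t N> class CartesianProduct {
public:
  class iterator;          // guarded together with the enclosing class
  iterator begin() const;  // member functions that previously tripped the NVCC bug
  iterator end() const;
};
#endif
```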
> Those changes are on my local machine. If you are OK with them, I can either push them directly to this huffman-decoding branch, or let you know what changed and you can make the commit.
Please push them directly to this branch.
> I noticed that the new lossless compress/decompress implementation has a lower throughput compared with the old implementation. … Is this performance expected?
Definitely not expected/intended, but I didn't do any performance regression tests, so I'm not shocked to learn that something is slow. I'll look into this next week and try to fix it before asking for this branch to be merged.
@ben-e-whitney Thanks for the reply.
> Yes, I think that's fine. I'd even put all of `CartesianProduct` and `CartesianProduct::iterator` in the `#ifndef __NVCC__`–`#endif` block. I think that might make any future compiler errors easier to understand.
I have put both `CartesianProduct` and `CartesianProduct::iterator` in the `#ifndef __NVCC__` block and pushed the changes to this branch. It compiles fine both on my machine and on GitHub. It also passes all MGARD-X tests on my machine using the new CPU lossless compressor, but some of your tests are reporting failures. For example: `/home/runner/work/MGARD/MGARD/tests/src/test_compress.cpp:248: FAILED`. Do you know what might be wrong?
@JieyangChen7 I modified `include/mgard-x/CompressionHighLevel/Metadata.hpp` by replacing `std::int64_t` with `QUANTIZED_INT` in a few places in the commit I just pushed (4ef023d0affabc2023ab3c575af9c798055ac8d9). Please give those changes a quick look.
I'm still working on the performance regression and the test failures.