
PARQUET-2184: Improve the allocation behavior of SnappyCompressor

Open abaranec opened this issue 3 years ago • 3 comments

This PR improves the allocation behavior of SnappyCompressor. Previously, when more buffer space was needed, it allocated only enough for the new data being written. Now it doubles the internal buffer size up to 8MB, and after that grows in 1MB increments.
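The growth policy described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the class name `BufferGrowthSketch`, the method `nextCapacity`, and the constant names are hypothetical. It also assumes that the last doubling step may overshoot 8MB and that the very first allocation uses the requested size.

```java
// Hypothetical sketch of the growth policy; not the actual parquet-java code.
final class BufferGrowthSketch {
    // Thresholds taken from the PR description.
    static final int DOUBLING_LIMIT = 8 * 1024 * 1024; // 8MB
    static final int LINEAR_STEP = 1024 * 1024;        // 1MB

    /** Returns a capacity >= required, grown from the current capacity. */
    static int nextCapacity(int current, int required) {
        if (required <= current) {
            return current; // already large enough, no reallocation needed
        }
        // Assumption: the first allocation starts from the requested size.
        long capacity = current > 0 ? current : required;
        while (capacity < required) {
            if (capacity < DOUBLING_LIMIT) {
                capacity *= 2;           // geometric phase below 8MB
            } else {
                capacity += LINEAR_STEP; // linear phase at or above 8MB
            }
        }
        // Clamp so the result still fits in an int.
        return (int) Math.min(capacity, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int capacity = 0;
        for (int required : new int[] {100, 150, 500, 9_000_000, 20_000_000}) {
            capacity = nextCapacity(capacity, required);
            System.out.println("required=" + required + " -> capacity=" + capacity);
        }
    }
}
```

Growing geometrically while the buffer is small keeps the number of reallocations low, and switching to fixed 1MB steps later limits how much memory a single growth step can over-allocate.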

No additional unit tests are added, since the existing unit tests for SnappyCodec and others already verify correctness. I have personally verified the performance gains using JMH benchmarks.

abaranec avatar Sep 04 '22 04:09 abaranec

I wonder how much benefit we can gain from this fix?

shangxinli avatar Sep 27 '22 22:09 shangxinli

@abaranec can you resolve the conflict?

shangxinli avatar Dec 03 '22 18:12 shangxinli

@shangxinli Sorry it took so long to do this. I resolved the conflicts. The changes essentially all moved into NonBlockedCompressor. I also incorporated the two changes you suggested.

One other thing worth discussing: for the first allocation, I'm just using the requested size as the initial size. It occurs to me that it might be better to use a little more memory to guarantee that we start with, and continue with, an 8-byte-aligned buffer size, maybe starting at 16 or 32 bytes. What do you think?
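For illustration, the alignment idea could look something like the sketch below. The class name `AlignedInitialSizeSketch`, the method `initialCapacity`, and the 32-byte floor are assumptions made for this example and are not part of the PR.

```java
// Hypothetical sketch of an 8-byte-aligned initial size; not the actual parquet-java code.
final class AlignedInitialSizeSketch {
    static final int MIN_INITIAL_SIZE = 32; // assumed floor; the comment above suggests 16 or 32
    static final int ALIGNMENT = 8;         // must be a power of two for the mask below

    /** Rounds the requested size up to the floor, then up to a multiple of 8. */
    static int initialCapacity(int requested) {
        long size = Math.max(requested, MIN_INITIAL_SIZE);
        // Round up to the next multiple of ALIGNMENT using a bit mask.
        long aligned = (size + ALIGNMENT - 1) & ~(long) (ALIGNMENT - 1);
        // Throws ArithmeticException if the aligned size no longer fits in an int.
        return Math.toIntExact(aligned);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 33, 100, 4096}) {
            System.out.println("requested=" + requested
                + " -> initial=" + initialCapacity(requested));
        }
    }
}
```

Because doubling a multiple of 8 and adding 1MB both yield multiples of 8, an aligned starting size would stay aligned through every later growth step.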

abaranec avatar Dec 05 '22 14:12 abaranec