
HDDS-10411. Support incremental ChunkBuffer checksum calculation


What changes were proposed in this pull request?

Warning: proof of concept. Unoptimized, untested, and potentially incorrect implementation.

Problem Statement

Currently, by default, each 4 MB block chunk is further divided into 16 KB checksum chunks (reduced from 1 MB in HDDS-10465) for checksum calculation.

The problem is that, even with the smaller checksum chunk size, clients still recalculate the checksums for the whole 4 MB block chunk from the beginning every single time:

https://github.com/apache/ozone/blob/e57370124a36315d2be5791753912901f836ccd8/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java#L171-L177
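
As a rough, simplified illustration of that pattern (this is not the actual Ozone code linked above), every flush re-walks the whole buffer and recomputes each 16 KB checksum from offset 0, so the cost grows with everything written so far rather than with the newly appended bytes:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Simplified illustration only -- not the linked Ozone implementation.
final class FullRecomputeIllustration {
  static final int BYTES_PER_CHECKSUM = 16 * 1024; // 16 KB checksum chunk

  // Called on every flush: walks the whole buffer from the beginning,
  // recomputing every checksum chunk even if only the tail changed.
  static List<Long> computeChecksums(ByteBuffer data) {
    List<Long> checksums = new ArrayList<>();
    ByteBuffer buf = data.duplicate();
    while (buf.hasRemaining()) {
      int len = Math.min(BYTES_PER_CHECKSUM, buf.remaining());
      byte[] chunk = new byte[len];
      buf.get(chunk);
      CRC32 crc = new CRC32();
      crc.update(chunk, 0, len);
      checksums.add(crc.getValue());
    }
    return checksums;
  }
}
```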

This PR aims to implement a checksum cache to reduce the CPU time spent in the critical section on checksum calculation, in the hope of greatly improving client hsync throughput when checksums are enabled.
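
As a minimal sketch of the idea (hypothetical names, not the classes added by this PR), checksums of 16 KB chunks that have already been completed can be served from a cache, so each flush only computes checksums for the newly written bytes and the trailing partial chunk:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch: cache checksums of completed 16 KB chunks so that
// only the unfinished tail of the buffer is recomputed on each flush.
// Assumes the buffer only grows between calls (append-only hsync pattern).
final class CachedChecksumSketch {
  private static final int BYTES_PER_CHECKSUM = 16 * 1024;
  private final List<Long> cache = new ArrayList<>(); // one entry per full chunk

  List<Long> computeChecksums(ByteBuffer data) {
    List<Long> result = new ArrayList<>(cache);       // reuse cached full chunks
    ByteBuffer buf = data.duplicate();
    buf.position(cache.size() * BYTES_PER_CHECKSUM);  // skip the cached prefix
    while (buf.hasRemaining()) {
      int len = Math.min(BYTES_PER_CHECKSUM, buf.remaining());
      byte[] chunk = new byte[len];
      buf.get(chunk);
      CRC32 crc = new CRC32();
      crc.update(chunk, 0, len);
      long checksum = crc.getValue();
      result.add(checksum);
      if (len == BYTES_PER_CHECKSUM) {
        cache.add(checksum);                          // only full chunks are cached
      }
    }
    return result;
  }
}
```

With this pattern, each flush does work proportional to the newly written bytes plus at most one 16 KB chunk, instead of re-hashing the entire 4 MB block chunk.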

TODOs

  • [ ] Add client config key to enable client checksum cache?
  • [ ] Thoroughly test all code paths that use Checksum
  • [ ] Verify boundary calculations in edge cases

Future Work

There are many more improvements that could be made to checksum calculation, such as:

  1. Even finer-grained incremental checksum calculation.
  • For CRC32/CRC32C, the checksum can be updated on a byte-by-byte basis, rather than having to recalculate the entire 16 KB chunk (see the illustrative sketch below).
  • For SHA256 and MD5, the checksum can be updated every 64 bytes (512 bits).
  2. Transfer only the unacknowledged checksums to the datanode. This requires a proto change.

But those are beyond the scope of this Jira and would require major refactoring.
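
For illustration only (out of scope for this PR), item 1 above works because CRC implementations such as java.util.zip.CRC32 keep a running state across update() calls, so a chunk's checksum can be extended as new bytes arrive instead of being recomputed from scratch:

```java
import java.util.zip.CRC32;

// Illustrative only: feeding the same bytes in pieces yields the same CRC32
// as computing it over the whole buffer at once, because the running state
// carries across update() calls.
final class IncrementalCrc32Demo {
  public static void main(String[] args) {
    byte[] data = "hello, incremental checksum".getBytes();

    CRC32 full = new CRC32();
    full.update(data, 0, data.length);              // recompute over everything

    CRC32 incremental = new CRC32();
    incremental.update(data, 0, 10);                // earlier bytes, processed once
    incremental.update(data, 10, data.length - 10); // later bytes appended

    System.out.println(full.getValue() == incremental.getValue()); // prints true
  }
}
```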

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10411

How was this patch tested?

  • [ ] New unit test cases to be added
