HDDS-4239. [Design] Ozone support truncate operation
What changes were proposed in this pull request?
This is a design doc, moved from Google Docs to make its progress easier to track.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-4239.
Hi @maobaolong, before the final merge, can we bring this design to the Ozone dev mailing list so that more people can look into it? I don't think other people are aware of it yet.
@linyiqun Thanks for the suggestion. I sent this design to the Ozone dev mailing list about 50 days ago, and I also posted it in the Slack channel. So I think it's time to merge it; otherwise we cannot push it forward.

@linyiqun Thank you for the reminder, but this design doc was sent to ozone-dev on 2020-9-21, and we discussed it in the Ozone community sync meeting with no objections. We need to merge this design doc before further coding work, and this PR has now blocked our schedule for more than a month. To respect your suggestion, and to double-check that there are no objections, we can discuss this design doc in the next Ozone community sync meeting. Everyone is welcome.
The meeting starts every Friday at 12:15 PM (UTC+8, Beijing time).
Okay, maybe I missed that.
@linyiqun Would you like to join tomorrow's Ozone sync meeting?
@xiaoyuyao Would you like to review this PR again, so that it can be merged before next week? Thanks a lot.
How about supporting truncate only at block-level granularity? That is, if a key needs to be truncated, change the length of the partially truncated block in OM, and simply delete all the other blocks that fall entirely beyond the new length. These blocks can then be picked up by the KeyDeleting service in OM and deleted through the existing delete workflow in Ozone.
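A minimal sketch of this block-level idea in Java, for illustration only. The class `TruncatePlanSketch`, its `Plan` type, and the `plan` method are hypothetical names, not Ozone APIs; the sketch assumes a key is just an ordered list of block lengths and only computes which lengths OM would record and which blocks would be handed to the delete workflow.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of block-level truncate planning; not Ozone's actual OM code. */
final class TruncatePlanSketch {

  /** Outcome of a truncate: lengths of the blocks that stay, and indexes of blocks to delete. */
  static final class Plan {
    final List<Long> keptLengths = new ArrayList<>();
    final List<Integer> deletedIndexes = new ArrayList<>();
  }

  /**
   * Plans a truncate of a key made of blocks with the given lengths (in order).
   * Blocks that start at or after newLength are marked for deletion; the block
   * that straddles newLength only gets a shorter recorded length.
   */
  static Plan plan(List<Long> blockLengths, long newLength) {
    long keyLength = 0;
    for (long len : blockLengths) {
      keyLength += len;
    }
    if (newLength < 0 || newLength > keyLength) {
      throw new IllegalArgumentException("newLength must be within [0, key length]");
    }

    Plan plan = new Plan();
    long offset = 0; // start offset of the current block within the key
    for (int i = 0; i < blockLengths.size(); i++) {
      long len = blockLengths.get(i);
      if (offset >= newLength) {
        // Entire block lies beyond the new end of the key: delete it.
        plan.deletedIndexes.add(i);
      } else {
        // Block is fully or partially retained: record the (possibly shorter) length.
        plan.keptLengths.add(Math.min(len, newLength - offset));
      }
      offset += len;
    }
    return plan;
  }

  public static void main(String[] args) {
    // Three blocks of 128, 128 and 64 units; truncate the key to 200 units.
    Plan p = plan(List.of(128L, 128L, 64L), 200L);
    System.out.println(p.keptLengths + " " + p.deletedIndexes); // prints "[128, 72] [2]"
  }
}
```

In this sketch the second block keeps its data on disk and only its recorded length shrinks from 128 to 72, while the third block is the only one passed on for deletion.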
@bshashikant +1, thanks for the suggestion. With this approach, for the partially truncated block we only need to change the block length in OM and do not need to change SCM or Datanode, so the work becomes easier. The only downside is that each truncate operation wastes at most 128MB (the size of a block) of disk space, because we do not process the partially truncated block. But since truncate happens rarely, this is not a problem. If we want to process the partially truncated block to save disk capacity, we can still do that in the future; it is compatible with the current design. @GlenGeng @xiaoyuyao What do you think?
Yeah, this is quite a straightforward solution. +1
In addition, one more question: as we know, ReplicationManager also replicates closed containers (blocks), but the truncate operation can now update the content of containers in this state. So how do we control consistency between the two? If we truncate a key while the container holding one of its blocks is being replicated, there is a high chance that the truncate will fail.
@linyiqun Thanks for the review. In this design, in OM we only delete the fully truncated blocks and change the length of the partially truncated block, and in the datanode we delete the block files of the fully truncated blocks. In OM, HA is responsible for consistency. In the datanode, block deletion is already implemented for key deletion, so we can reuse that logic to keep consistency. For the partially truncated block, I do not plan to truncate the block file in the datanode, and I will change the design accordingly. So we actually only need to change OM and do not need to worry about consistency.
@runzhiwang @ChenSammi I will close this PR if it is not being worked on.
Closing this PR for now, we can reopen it when we want to.