ozone icon indicating copy to clipboard operation
ozone copied to clipboard

HDDS-7137. Add CLI for Getting the failed deleted block txn

Open Xushaohong opened this issue 3 years ago • 9 comments

What changes were proposed in this pull request?

Supplement CLI for deleted block txn.

  1. Integrate the related CLI
  2. Add new CLI to list txns
  3. Support JSON format file as output and input
  4. Improve UT

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7137

How was this patch tested?

UT WX20220818-165155 WX20220818-165230 WX20220818-165132 WX20220818-164911

Xushaohong avatar Aug 18 '22 08:08 Xushaohong

Moving the DeletedBlocksTransaction to common hdds.proto is an ideal solution, but it will lead to the compatibility problem, so I adds the import of ScmServerDatanodeHeartbeatProtocol.proto.

Xushaohong avatar Aug 18 '22 11:08 Xushaohong

@adoroszlai @ChenSammi PTAL~

Xushaohong avatar Aug 22 '22 09:08 Xushaohong

Hey @Xushaohong , could you past some example output of this command? For both console output and file output.

ChenSammi avatar Aug 22 '22 14:08 ChenSammi

Hey @Xushaohong , could you past some example output of this command? For both console output and file output.

@ChenSammi Sure, I screenshotted three pics, and the result of reset is not actually worked due to the non-instant flush as you know

show3 show2 show1

Xushaohong avatar Aug 24 '22 11:08 Xushaohong

Moving the DeletedBlocksTransaction to common hdds.proto is an ideal solution, but it will lead to the compatibility problem, so I adds the import of ScmServerDatanodeHeartbeatProtocol.proto

I don't think we want this dependency chain. hdds.proto exists to prevent the admin and heartbeat proto from depending on each other. Cross client and rolling upgrade compatability (future work) will be harder to maintain with this dependency. The DeletedBlocksTransaction message is not very complicated. Could we make a different version of that message to be used in the client response placed in ScmAdminProtocol.proto?

errose28 avatar Aug 27 '22 00:08 errose28

Moving the DeletedBlocksTransaction to common hdds.proto is an ideal solution, but it will lead to the compatibility problem, so I adds the import of ScmServerDatanodeHeartbeatProtocol.proto

I don't think we want this dependency chain. hdds.proto exists to prevent the admin and heartbeat proto from depending on each other. Cross client and rolling upgrade compatability (future work) will be harder to maintain with this dependency. The DeletedBlocksTransaction message is not very complicated. Could we make a different version of that message to be used in the client response placed in ScmAdminProtocol.proto?

The problem mentioned is right, as the migration of proto would bring the compatibility issue, I just considered the reuse form. I have now added the DeletedBlocksTransactionInfo

Xushaohong avatar Aug 29 '22 12:08 Xushaohong

Thanks for the proto update @Xushaohong. I probably won't have time to review the rest of this PR but that part looks good. Just a question: I see the new proto is in hdds.proto. Would it make more sense to be in SCMAdminProtocol.proto since it is used as the response to an SCM admin command?

errose28 avatar Aug 29 '22 18:08 errose28

Thanks for the proto update @Xushaohong. I probably won't have time to review the rest of this PR but that part looks good. Just a question: I see the new proto is in hdds.proto. Would it make more sense to be in SCMAdminProtocol.proto since it is used as the response to an SCM admin command?

Hi @errose28, hdds.proto is a common client proto, my motivation for putting the definition here has two reasons:

  1. The DeletedBlocksTransactionInfo could be used for the future client request, it is a prospective placement. Refer to the proto of ContainerInfoProto and Pipeline.
  2. I think the placement in ScmServerDatanodeHeartbeatProtocol.proto is a historical problem. Originally it is only considered to be used in HB, not in client queries. Such common proto shall be unified and could help reduce confusion for the user and reader. So I wonder if someday we change the reference of this TXN proto from the ScmServerDatanodeHeartbeatProtocol.proto to the hdds.proto. And we could deprecate the old reference.

Pls feel free to correct me if i am wrong~

Xushaohong avatar Aug 30 '22 02:08 Xushaohong

I think the placement in ScmServerDatanodeHeartbeatProtocol.proto is a historical problem. Originally it is only considered to be used in HB, not in client queries. Such common proto shall be unified and could help reduce confusion for the user and reader. So I wonder if someday we change the reference of this TXN proto from the ScmServerDatanodeHeartbeatProtocol.proto to the hdds.proto. And we could deprecate the old reference.

I think this is a good point. If we put the new message in ScmAdminProtocol.proto we will have the two messages forever since we need to support all old clients. If we put the new message in hdds.proto then later we could switch datanode and SCM communication to use the new message in hdds.proto instead of the old one in ScmServerDatanodeHeartbeatProtocol.proto since we do not support rolling upgrades yet.

errose28 avatar Sep 01 '22 21:09 errose28

@errose28 Can this be merged? I see the changes to move to ScmAdminProtocol has been made.

swamirishi avatar Jan 27 '23 18:01 swamirishi

@Xushaohong can you please rebase and resolve conflicts?

kerneltime avatar Feb 06 '23 06:02 kerneltime

@Xushaohong can you please rebase and resolve conflicts?

@kerneltime done

Xushaohong avatar Feb 07 '23 02:02 Xushaohong

@Xushaohong this looks like a pretty straightforward change, please address the minor nits.

kerneltime avatar Feb 07 '23 02:02 kerneltime

image image The latest usage is as above

Xushaohong avatar Mar 02 '23 12:03 Xushaohong

The last patch LGTM + 1. Thanks @Xushaohong for the contribution and @errose28 @swamirishi @kerneltime for the code review.

ChenSammi avatar Mar 06 '23 08:03 ChenSammi