HA issue about the PDClient

Open iosmanthus opened this issue 3 years ago • 1 comments

Bug Report

1. Describe the bug

With the region cache disabled, killing the PD leader can leave the client unavailable because of faulty probe logic.

If the following code throws an exception such as "retry is exhausted", the remaining PD servers are not probed: https://github.com/tikv/client-java/blob/1b5edcd8ab8ee12e4dbb84e7e4c008075d473a7a/src/main/java/org/tikv/common/PDClient.java#L549
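To illustrate the behavior being asked for, here is a minimal sketch of a probe loop that keeps going when one member fails. It is not the code in `PDClient.java`; the class `PdProbeSketch`, the method `probeAll`, and the addresses are hypothetical names chosen for the example, and the probe itself is passed in as a plain function.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class PdProbeSketch {
    /**
     * Tries each PD address in turn and returns the first non-null result.
     * A failure on one address is logged and the loop moves on, so a single
     * "retry is exhausted" error cannot hide the remaining members.
     */
    public static <T> Optional<T> probeAll(List<String> addresses, Function<String, T> probe) {
        for (String address : addresses) {
            try {
                T result = probe.apply(address);
                if (result != null) {
                    return Optional.of(result);
                }
            } catch (RuntimeException e) {
                // Hypothetical handling: note the failure and keep probing.
                System.err.println("probe failed for " + address + ": " + e.getMessage());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up addresses; the first one simulates the killed PD leader.
        List<String> pds = List.of("pd-0:2379", "pd-1:2379", "pd-2:2379");
        Optional<String> reachable = probeAll(pds, addr -> {
            if (addr.equals("pd-0:2379")) {
                throw new RuntimeException("retry is exhausted");
            }
            return addr;
        });
        System.out.println(reachable.orElse("no PD reachable"));
    }
}
```

The point of the sketch is only that the `try`/`catch` sits inside the loop, so an exception from one member does not abort the whole probe.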

Another issue is that the client becomes unavailable when the PD leader is down and it encounters TsoBatchUsedUp, since the writer needs to acquire a TS from TSO in TiKV 6.2.0. While handling TsoBatchUsedUp, the region cache should not be cleared, because a region being unavailable right now does not mean its cached leader is no longer the leader.
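A rough sketch of the distinction suggested here, under stated assumptions: the class `RegionErrorSketch`, its `ErrorKind` enum, and the map-based cache are all hypothetical and do not mirror the client's real error handler or region cache. It only shows a handler that drops the cached leader when the store reports that it is not the leader, and leaves the cache alone for TsoBatchUsedUp.

```java
import java.util.HashMap;
import java.util.Map;

public class RegionErrorSketch {
    /** Hypothetical error kinds; only the two relevant to this issue are modeled. */
    public enum ErrorKind { NOT_LEADER, TSO_BATCH_USED_UP }

    // Stand-in for a region cache: region id -> address of the cached leader store.
    private final Map<Long, String> leaderByRegion = new HashMap<>();

    public void cacheLeader(long regionId, String store) {
        leaderByRegion.put(regionId, store);
    }

    public String cachedLeader(long regionId) {
        return leaderByRegion.get(regionId);
    }

    /** Returns true if the caller should retry the request. */
    public boolean onError(long regionId, ErrorKind kind) {
        switch (kind) {
            case NOT_LEADER:
                // The cached leader is stale, so drop it before retrying.
                leaderByRegion.remove(regionId);
                return true;
            case TSO_BATCH_USED_UP:
                // The store is still the leader; it is only waiting on TSO.
                // Keep the cache entry and retry (after a backoff, in practice).
                return true;
            default:
                return false;
        }
    }
}
```

With this shape, a retry after TsoBatchUsedUp can go straight back to the same store without asking PD for region information while PD has no leader.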

2. Minimal reproduce step (Required)

  1. Create a cluster with 3 PD nodes and disable the region cache so that it is cleared after every request.
  2. Kill the PD leader.
  3. The client hangs.

3. What did you see instead (Required)

  1. The client recovers only after a new PD leader is elected.

5. What are your Java Client and TiKV versions? (Required)

  • Client Java: master
  • TiKV: v6.2.0

iosmanthus avatar Sep 02 '22 15:09 iosmanthus

This issue is stale because it has been open 30 days with no activity.

github-actions[bot] avatar Oct 06 '22 01:10 github-actions[bot]