
HDFS-17564. Erasure Coding: Fix the issue of inaccurate metrics when decommission mark busy DN

Open haiyang1987 opened this issue 1 year ago • 4 comments

Description of PR

https://issues.apache.org/jira/browse/HDFS-17564

If a DataNode is marked as busy and holds many EC blocks, then when that DataNode is decommissioned and ErasureCodingWork#addTaskToDatanode executes, no replication work is generated for ecBlocksToBeReplicated, yet the related metrics (such as DatanodeDescriptor#currApproxBlocksScheduled, pendingReconstruction and neededReconstruction) are still updated.

Specific code path: BlockManager#scheduleReconstruction -> BlockManager#chooseSourceDatanodes [2628~2650]. If a DataNode is marked as busy (its queued replication work exceeds the stream limits), it is not added to srcNodes:

@VisibleForTesting
DatanodeDescriptor[] chooseSourceDatanodes(BlockInfo block,
    List<DatanodeDescriptor> containingNodes,
    List<DatanodeStorageInfo> nodesContainingLiveReplicas,
    NumberReplicas numReplicas, List<Byte> liveBlockIndices,
    List<Byte> liveBusyBlockIndices, List<Byte> excludeReconstructed, int priority) {
  containingNodes.clear();
  nodesContainingLiveReplicas.clear();
  List<DatanodeDescriptor> srcNodes = new ArrayList<>();
 ...

  for (DatanodeStorageInfo storage : blocksMap.getStorages(block)) {
    final DatanodeDescriptor node = getDatanodeDescriptorFromStorage(storage);
    final StoredReplicaState state = checkReplicaOnStorage(numReplicas, block,
        storage, corruptReplicas.getNodes(block), false);
    ...
    // For EC, make sure the replica state in numReplicas is counted correctly
    // here, because scheduleReconstruction relies on numReplicas to decide
    // whether the EC internal block needs to be reconstructed.
    byte blockIndex = -1;
    if (isStriped) {
      blockIndex = ((BlockInfoStriped) block)
          .getStorageBlockIndex(storage);
      countLiveAndDecommissioningReplicas(numReplicas, state,
          liveBitSet, decommissioningBitSet, blockIndex);
    }

    if (priority != LowRedundancyBlocks.QUEUE_HIGHEST_PRIORITY
        && (!node.isDecommissionInProgress() && !node.isEnteringMaintenance())
        && node.getNumberOfBlocksToBeReplicated() +
        node.getNumberOfBlocksToBeErasureCoded() >= maxReplicationStreams) {
      if (isStriped && (state == StoredReplicaState.LIVE
          || state == StoredReplicaState.DECOMMISSIONING)) {
        liveBusyBlockIndices.add(blockIndex);
        // HDFS-16566: indices in excludeReconstructed won't be reconstructed.
        excludeReconstructed.add(blockIndex);
      }
      continue; // already reached replication limit
    }

    if (node.getNumberOfBlocksToBeReplicated() +
        node.getNumberOfBlocksToBeErasureCoded() >= replicationStreamsHardLimit) {
      if (isStriped && (state == StoredReplicaState.LIVE
          || state == StoredReplicaState.DECOMMISSIONING)) {
        liveBusyBlockIndices.add(blockIndex);
        // HDFS-16566: indices in excludeReconstructed won't be reconstructed.
        excludeReconstructed.add(blockIndex);
      }
      continue;
    }

    if (isStriped || srcNodes.isEmpty()) {
      srcNodes.add(node);
      if (isStriped) {
        liveBlockIndices.add(blockIndex);
      }
      continue;
    }
   ...
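
To make the skip path concrete: a decommissioning DataNode passes the first (soft-limit) check above, since !node.isDecommissionInProgress() is false, but once its queued work reaches replicationStreamsHardLimit it is dropped by the second check and never enters srcNodes. A minimal, self-contained sketch of that decision (the method name and sample values are hypothetical, for illustration only):

class BusyCheckSketch {
  // Hypothetical condensation of the two busy checks above; not Hadoop code.
  static boolean isSkippedAsBusy(boolean leavingService, int queuedWork,
      int maxReplicationStreams, int replicationStreamsHardLimit) {
    // Soft limit: only skips nodes that are NOT leaving service.
    if (!leavingService && queuedWork >= maxReplicationStreams) {
      return true;
    }
    // Hard limit: skips every node, including decommissioning ones.
    return queuedWork >= replicationStreamsHardLimit;
  }

  public static void main(String[] args) {
    // A busy decommissioning DN with 5 queued tasks and a hard limit of 4
    // is skipped, so it can never become a reconstruction source.
    System.out.println(isSkippedAsBusy(true, 5, 2, 4)); // prints true
  }
}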

ErasureCodingWork#addTaskToDatanode [149~157]:

@Override
void addTaskToDatanode(NumberReplicas numberReplicas) {
  final DatanodeStorageInfo[] targets = getTargets();
  assert targets.length > 0;
  BlockInfoStriped stripedBlk = (BlockInfoStriped) getBlock();

  ...
  } else if ((numberReplicas.decommissioning() > 0 ||
      numberReplicas.liveEnteringMaintenanceReplicas() > 0) &&
      hasAllInternalBlocks()) {
    List<Integer> leavingServiceSources = findLeavingServiceSources();
    // leavingServiceSources.size() should be >= targets.length.
    // If leavingServiceSources is empty, no replication work is created here.
    final int num = Math.min(leavingServiceSources.size(), targets.length);
    for (int i = 0; i < num; i++) {
      createReplicationWork(leavingServiceSources.get(i), targets[i]);
    }
  ...
}

// Since busy decommissioning DataNodes are never added to srcNodes,
// srcIndices is returned empty in this scenario.
private List<Integer> findLeavingServiceSources() {
  // Mark the indices of the block that live on in-service (normal) nodes.
  BlockInfoStriped block = (BlockInfoStriped) getBlock();
  BitSet bitSet = new BitSet(block.getRealTotalBlockNum());
  for (int i = 0; i < getSrcNodes().length; i++) {
    if (getSrcNodes()[i].isInService()) {
      bitSet.set(liveBlockIndices[i]);
    }
  }
  // If the block is on a node that is decommissioning or
  // entering maintenance, and it doesn't exist on another normal node,
  // add that node to the source list.
  List<Integer> srcIndices = new ArrayList<>();
  for (int i = 0; i < getSrcNodes().length; i++) {
    if ((getSrcNodes()[i].isDecommissionInProgress() ||
        (getSrcNodes()[i].isEnteringMaintenance() &&
        getSrcNodes()[i].isAlive())) &&
        !bitSet.get(liveBlockIndices[i])) {
      srcIndices.add(i);
    }
  }
  return srcIndices;
}
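
For context, the metric updates that go stale live in the caller. The following is a simplified paraphrase of BlockManager#validateReconstructionWork (trimmed; consult trunk for the exact code): the bookkeeping runs after addTaskToDatanode regardless of whether any task was queued.

// Simplified paraphrase of BlockManager#validateReconstructionWork.
// For a striped block whose findLeavingServiceSources() comes back empty,
// this call queues nothing...
rw.addTaskToDatanode(numberReplicas);
// ...yet the bookkeeping below still happens:
DatanodeStorageInfo.incrementBlocksScheduled(targets); // currApproxBlocksScheduled
// The block moves to pendingReconstruction although nothing is in flight,
pendingReconstruction.increment(block,
    DatanodeStorageInfo.toDatanodeDescriptors(targets));
// and it may be removed from neededReconstruction as well.
neededReconstruction.remove(block, priority);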

So we need to fix this logic so that these metrics are not updated when no reconstruction work is actually scheduled.
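
One possible shape for such a fix (a sketch only, with a hypothetical changed signature; the committed patch may differ): have addTaskToDatanode report whether it actually queued anything, and let validateReconstructionWork skip the bookkeeping otherwise.

// Hypothetical sketch, not the committed patch: suppose addTaskToDatanode
// were changed to return the number of tasks it queued.
int queued = rw.addTaskToDatanode(numberReplicas);
if (queued == 0) {
  // Nothing was scheduled (e.g. every leaving-service replica sits on a
  // busy DN, so findLeavingServiceSources() was empty): keep the block in
  // neededReconstruction and leave pendingReconstruction and
  // currApproxBlocksScheduled untouched.
  return false;
}
// Only when work was really queued is it correct to update the metrics:
DatanodeStorageInfo.incrementBlocksScheduled(targets);
pendingReconstruction.increment(block,
    DatanodeStorageInfo.toDatanodeDescriptors(targets));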

haiyang1987 avatar Jun 30 '24 13:06 haiyang1987

:broken_heart: -1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 0m 0s Docker mode activated.
-1 :x: docker 1m 27s Docker failed to build run-specific yetus/hadoop:tp-7840.
Subsystem Report/Notes
GITHUB PR https://github.com/apache/hadoop/pull/6911
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/1/console
versions git=2.34.1
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jun 30 '24 13:06 hadoop-yetus

:broken_heart: -1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 6m 52s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 1s codespell was not available.
+0 :ok: detsecrets 0m 1s detect-secrets was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
+1 :green_heart: test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 36m 0s trunk passed
+1 :green_heart: compile 0m 44s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: compile 0m 40s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: checkstyle 0m 41s trunk passed
+1 :green_heart: mvnsite 0m 48s trunk passed
+1 :green_heart: javadoc 0m 43s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: javadoc 1m 8s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: spotbugs 1m 56s trunk passed
+1 :green_heart: shadedclient 22m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 0m 38s the patch passed
+1 :green_heart: compile 0m 37s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: javac 0m 37s the patch passed
+1 :green_heart: compile 0m 35s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: javac 0m 35s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: checkstyle 0m 29s the patch passed
+1 :green_heart: mvnsite 0m 38s the patch passed
+1 :green_heart: javadoc 0m 31s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: spotbugs 1m 42s the patch passed
+1 :green_heart: shadedclient 22m 36s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 :x: unit 195m 41s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
+1 :green_heart: asflicense 0m 28s The patch does not generate ASF License warnings.
295m 52s
Reason Tests
Failed junit tests hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor
hadoop.hdfs.server.blockmanagement.TestDatanodeManager
hadoop.hdfs.server.blockmanagement.TestBlockManager
hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/2/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/6911
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux cfa97df1b2c7 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 130d9ae5817fe87e6a1e6150cb7efc23f4ed570a
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/2/testReport/
Max. process+thread count 4949 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jul 01 '24 07:07 hadoop-yetus

The failed unit tests are not related to this PR.

haiyang1987 avatar Jul 01 '24 11:07 haiyang1987

Hi @Hexiaoqiao @ZanderXu @zhangshuyan0, could you please help review this PR when you have free time? Thanks~

haiyang1987 avatar Jul 01 '24 11:07 haiyang1987

Updated the PR. Hi @Hexiaoqiao, please help review it again, thanks!

haiyang1987 avatar Jul 04 '24 02:07 haiyang1987

:broken_heart: -1 overall

Vote Subsystem Runtime Logfile Comment
+0 :ok: reexec 14m 11s Docker mode activated.
_ Prechecks _
+1 :green_heart: dupname 0m 0s No case conflicting files found.
+0 :ok: codespell 0m 0s codespell was not available.
+0 :ok: detsecrets 0m 0s detect-secrets was not available.
+1 :green_heart: @author 0m 0s The patch does not contain any @author tags.
+1 :green_heart: test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 :green_heart: mvninstall 45m 5s trunk passed
+1 :green_heart: compile 1m 21s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: compile 1m 18s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: checkstyle 1m 11s trunk passed
+1 :green_heart: mvnsite 1m 25s trunk passed
+1 :green_heart: javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: javadoc 1m 44s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: spotbugs 3m 20s trunk passed
+1 :green_heart: shadedclient 35m 43s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 :green_heart: mvninstall 1m 11s the patch passed
+1 :green_heart: compile 1m 14s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: javac 1m 14s the patch passed
+1 :green_heart: compile 1m 5s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: javac 1m 5s the patch passed
+1 :green_heart: blanks 0m 0s The patch has no blanks issues.
+1 :green_heart: checkstyle 0m 58s the patch passed
+1 :green_heart: mvnsite 1m 12s the patch passed
+1 :green_heart: javadoc 0m 54s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 :green_heart: javadoc 1m 37s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 :green_heart: spotbugs 3m 10s the patch passed
+1 :green_heart: shadedclient 35m 56s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 :x: unit 227m 36s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
+1 :green_heart: asflicense 0m 54s The patch does not generate ASF License warnings.
380m 17s
Reason Tests
Failed junit tests hadoop.hdfs.server.blockmanagement.TestDatanodeManager
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/3/artifact/out/Dockerfile
GITHUB PR https://github.com/apache/hadoop/pull/6911
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux ea2255be361b 5.15.0-113-generic #123-Ubuntu SMP Mon Jun 10 08:16:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a4aaa06477dcd8603c70654dd5d63fd566507b40
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/3/testReport/
Max. process+thread count 4184 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6911/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

hadoop-yetus avatar Jul 04 '24 09:07 hadoop-yetus

Will commit if there are no other comments after waiting one workday.

Hexiaoqiao avatar Jul 04 '24 13:07 Hexiaoqiao

Committed to trunk. Thanks @haiyang1987. The failed unit test is not related to this PR.

Hexiaoqiao avatar Jul 05 '24 12:07 Hexiaoqiao

Thanks @Hexiaoqiao for reviewing and merging it.

haiyang1987 avatar Jul 06 '24 04:07 haiyang1987