[Question]: CpuWasteDetector bug verify

Open ketingli1 opened this issue 2 years ago • 1 comments

In Spark's CpuWasteDetector, executorWastedPercentOverAll is computed as: float executorWastedPercentOverAll = (((float) inJobComputeMillisAvailable - inJobComputeMillisUsed) / appComputeMillisAvailable) * 100;

Here inJobComputeMillisAvailable is in units of CPU·time (cores × time), whereas inJobComputeMillisUsed is accumulated from a TaskMetrics field of the SparkListenerTaskEnd event, @JsonProperty("Executor Run Time") private Long executorRunTime;, whose unit is plain time.

Shouldn't the executor's CPU time field be accumulated instead? @JsonProperty("Executor CPU Time") private Long executorCpuTime;
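
To make the question concrete, here is a minimal sketch of the arithmetic with made-up numbers. Only the expression itself comes from the detector; the class name CpuWasteSketch, the helper wastedPercent, and every input value are hypothetical. It also assumes that the summed "Executor CPU Time" is in nanoseconds (as the Spark documentation describes executorCpuTime), so it is converted to milliseconds before being used.

```java
import java.util.concurrent.TimeUnit;

class CpuWasteSketch {
    /** Same arithmetic as the expression quoted above. */
    static float wastedPercent(long inJobComputeMillisAvailable,
                               long inJobComputeMillisUsed,
                               long appComputeMillisAvailable) {
        return (((float) inJobComputeMillisAvailable - inJobComputeMillisUsed)
                / appComputeMillisAvailable) * 100;
    }

    public static void main(String[] args) {
        // Hypothetical inputs, not taken from a real job.
        long inJobAvailableMs = 8_000L;       // e.g. 4 cores * 2,000 ms inside jobs
        long appAvailableMs = 10_000L;        // e.g. 4 cores * 2,500 ms of app lifetime
        long sumRunTimeMs = 6_000L;           // sum of "Executor Run Time" over tasks
        long sumCpuTimeNs = 4_000_000_000L;   // sum of "Executor CPU Time" (assumed ns)

        // Convert the CPU-time sum to milliseconds so both variants share a unit.
        long sumCpuTimeMs = TimeUnit.NANOSECONDS.toMillis(sumCpuTimeNs);

        System.out.println("used = run time: "
                + wastedPercent(inJobAvailableMs, sumRunTimeMs, appAvailableMs) + "%");
        System.out.println("used = CPU time: "
                + wastedPercent(inJobAvailableMs, sumCpuTimeMs, appAvailableMs) + "%");
    }
}
```

With these inputs the CPU-time variant reports more waste than the run-time variant, because the time tasks spend off-CPU is no longer counted as used.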

ketingli1 avatar Aug 15 '23 10:08 ketingli1

Thanks for your feedback. According to the documentation at https://spark.apache.org/docs/latest/monitoring.html , there is not much difference between executorRunTime and executorCpuTime. But in the Spark source code, https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L608-L610 https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L655-L656 , the metric executorCpuTime is measured with threadMXBean.getCurrentThreadCpuTime, which is documented at https://docs.oracle.com/javase/8/docs/api/java/lang/management/ThreadMXBean.html#getCurrentThreadCpuTime-- . So executorCpuTime counts only the time the thread spent executing in user mode or system mode, while executorRunTime covers the whole time the task was running on the executor. executorCpuTime also does not include thread context-switching time, so executorCpuTime is less than executorRunTime.
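
The gap between the two kinds of measurement can be seen outside Spark with a small standalone sketch. This is not Spark code: the class RunTimeVsCpuTime and the helper measure are hypothetical names used for illustration. It times a task on the current thread with both System.nanoTime (wall clock) and ThreadMXBean.getCurrentThreadCpuTime (CPU time), falling back to 0 when the JVM does not support thread CPU time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class RunTimeVsCpuTime {
    /** Runs the task on the current thread and returns {wallNanos, cpuNanos}. */
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = bean.isCurrentThreadCpuTimeSupported();

        long cpuStart = cpuSupported ? bean.getCurrentThreadCpuTime() : 0L;
        long wallStart = System.nanoTime();

        task.run();

        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = cpuSupported ? bean.getCurrentThreadCpuTime() - cpuStart : 0L;
        return new long[] {wallNanos, cpuNanos};
    }

    public static void main(String[] args) {
        // A task that mostly waits: wall-clock time grows, CPU time barely moves.
        long[] t = measure(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("wall = %d ms, cpu = %d ms%n",
                t[0] / 1_000_000, t[1] / 1_000_000);
    }
}
```

A task that sleeps or blocks on I/O accumulates wall-clock time but almost no CPU time, which is why summing run time and summing CPU time give different values for the "used" term.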

zebozhuang avatar Aug 24 '23 06:08 zebozhuang