ozone HDDS-11566. Replication tasks add transferredBytes and queuedTime metrics

What changes were proposed in this pull request?

Added transferredBytes and queuedTime metrics for replication tasks, including EC reconstruction and container replication.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-11566

How was this patch tested?

ci: https://github.com/jianghuazhu/ozone/actions/runs/11310889447

datanode jmx:

Oct 13 '24 05:10 jianghuazhu

@sodonnel @errose28 @fapifta, can you help review this PR? Thanks.

Oct 13 '24 05:10 jianghuazhu

@jianghuazhu I have some questions about these metrics. My understanding is that metrics should have practical measurement value. Each machine may receive a large number of EC recover commands and replication commands; how can the bytes transferred help us? Additionally, these commands themselves have a timeout, so they won't execute beyond that time. What is the value of the queueing time?

cc: @whbing @weimingdiit

Oct 13 '24 19:10 slfan1989

> My understanding is that metrics should have practical measurement value

@slfan1989 @jianghuazhu I agree with this view. Maybe the average queue waiting time is a useful metric, because it can indicate the backlog of commands in the queue, but I don't know whether the transferredBytes metric is meaningful for observing system performance.
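
For illustration, an average queue wait could be derived from two counters roughly as follows. This is only a sketch: `QueuedTimeTracker` and its methods are hypothetical names, not the PR's actual implementation, and timestamps are assumed to be in milliseconds.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical tracker for how long replication tasks wait before they start. */
class QueuedTimeTracker {
  // Sum of all observed wait times, in milliseconds.
  private final AtomicLong totalQueuedMillis = new AtomicLong();
  // Number of tasks that have left the queue and started running.
  private final AtomicLong startedTasks = new AtomicLong();

  /** Record one task leaving the queue; negative waits (clock skew) count as zero. */
  void recordStart(long enqueuedAtMillis, long startedAtMillis) {
    long waited = Math.max(0L, startedAtMillis - enqueuedAtMillis);
    totalQueuedMillis.addAndGet(waited);
    startedTasks.incrementAndGet();
  }

  /** Average wait in milliseconds, or 0.0 if no task has started yet. */
  double averageQueuedMillis() {
    long n = startedTasks.get();
    return n == 0 ? 0.0 : (double) totalQueuedMillis.get() / n;
  }
}
```

The two counters are read separately, so under concurrent updates the average is approximate. A rising average would suggest a growing backlog even while individual commands still finish before their timeout.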

Oct 14 '24 08:10 weimingdiit

We should have some way to approximate how much network traffic is happening due to replication tasks. To really visualize this it would have to go in a dashboard. I think this would look something like this:

1. Ozone tracks the total number of bytes transferred since the last restart.
2. In the dashboard (Grafana, for example) we approximate network traffic over time by subtracting the previous sampled metric value from the current one.
   - Note that the metric's implementation in Ozone would only ever add to the counter, so the value would never decrease unless the cluster is restarted.
   - The chart would then look like a step function where each plateau shows the amount of data transferred between its start and end times.
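
A minimal sketch of the two steps above, in Java since Ozone is a Java project. The class and method names (`TransferredBytesCounter`, `CounterDelta`) are hypothetical and are not taken from this PR; treating a drop in the counter as a restart is also an assumption about how a dashboard would handle resets.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a bytes counter that only ever increases. */
class TransferredBytesCounter {
  private final AtomicLong totalBytes = new AtomicLong();

  /** Called when a replication task has moved some data; rejects negative amounts. */
  void add(long bytes) {
    if (bytes < 0) {
      throw new IllegalArgumentException("bytes must be non-negative");
    }
    totalBytes.addAndGet(bytes);
  }

  /** Total bytes since this counter was created (that is, since the last restart). */
  long value() {
    return totalBytes.get();
  }
}

/** What a dashboard would compute from two consecutive samples of the counter. */
class CounterDelta {
  /** Bytes transferred between two samples; a drop is assumed to mean a restart. */
  static long delta(long previousSample, long currentSample) {
    return currentSample >= previousSample
        ? currentSample - previousSample
        : currentSample;
  }

  /** Approximate throughput over the sampling interval. */
  static double bytesPerSecond(long previousSample, long currentSample,
      long intervalSeconds) {
    if (intervalSeconds <= 0) {
      throw new IllegalArgumentException("intervalSeconds must be positive");
    }
    return (double) delta(previousSample, currentSample) / intervalSeconds;
  }
}
```

With samples of 300 and 400 taken 10 seconds apart, `delta` gives 100 and `bytesPerSecond` gives 10.0.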

@kerneltime is this the way such a thing is usually implemented? I'm not sure if we have a standard in other Ozone dashboards for how to chart continuous events like network traffic.

Oct 15 '24 18:10 errose28

Thanks @errose28 for the comment and review. @kerneltime, do you have any new suggestions? Thanks.

Oct 17 '24 03:10 jianghuazhu

@kerneltime @slfan1989 @weimingdiit how should we proceed on this PR?

Dec 11 '24 17:12 adoroszlai

> In the dashboard (Grafana, for example) we approximate network traffic over time by subtracting the previous sampled metric value from the current one.

Regarding this, I think there are two ways to achieve it:

1. Now that we have the total number of bytes transferred, we can compute the difference outside the Ozone system whenever we want to check the traffic trend. For example, if the total number of bytes sampled at 09:10:20 is 300 MB and the total sampled at 09:10:30 is 400 MB, then the traffic during 09:10:20 ~ 09:10:30 is 100 MB.
2. A common module can be designed to handle features such as traffic trends. It would need to aggregate by time period, for example 5s, 1min, 1h, and regularly collect the difference over each period: 09:10:20 ~ 09:10:25 -> 40 MB, 09:10:30 ~ 09:11:30 -> 500 MB, 09:10:30 ~ 10:10:30 -> 2 GB. These differences should be defined as instantaneous values. In general, their effects should be like this:

@adoroszlai @errose28 @kerneltime, what do you think?

Dec 13 '24 09:12 jianghuazhu

@jianghuazhu Thank you for your contribution! However, I still don't fully understand the specific purpose of this metric. In other words, how do the fluctuations of this metric help us assess the state of a DataNode? We have a cluster where 99% of the data uses EC, but based on my maintenance experience, I typically don’t pay much attention to traffic changes caused by reconstruction on a DataNode, because compared to read/write traffic, the impact should be negligible. If the practical use of this metric cannot be clearly explained, I think it would be better not to add it to the system, as it would only increase the complexity of the metric framework.

cc: @errose28 @adoroszlai @weimingdiit

Dec 13 '24 10:12 slfan1989

> A common module can be designed to handle features such as traffic trends. It would need to aggregate by time period, for example 5s, 1min, 1h, and regularly collect the difference over each period: 09:10:20 ~ 09:10:25 -> 40 MB, 09:10:30 ~ 09:11:30 -> 500 MB, 09:10:30 ~ 10:10:30 -> 2 GB. These differences should be defined as instantaneous values.

I partially agree with this: network traffic should indeed be viewed as a monitoring graph over time. However, I believe such a complex statistical module should not be added inside Ozone. Instead, it could be implemented in an external metric collection system.
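
As a rough illustration of how small that external aggregation could be, the sketch below turns periodic scrapes of the cumulative counter into per-window totals. `WindowedTraffic` and `windowDeltas` are hypothetical names, and the scrapes are assumed to arrive at a fixed interval with no restart in between.

```java
/** Hypothetical external aggregator working on scraped counter values. */
class WindowedTraffic {
  /**
   * cumulative[i] is the counter value at scrape i. Returns the bytes
   * transferred in each complete window of samplesPerWindow scrape intervals.
   */
  static long[] windowDeltas(long[] cumulative, int samplesPerWindow) {
    if (samplesPerWindow <= 0) {
      throw new IllegalArgumentException("samplesPerWindow must be positive");
    }
    if (cumulative.length < 2) {
      return new long[0]; // need at least two scrapes to form a difference
    }
    int windows = (cumulative.length - 1) / samplesPerWindow;
    long[] deltas = new long[windows];
    for (int w = 0; w < windows; w++) {
      int start = w * samplesPerWindow;
      int end = start + samplesPerWindow;
      // Difference between the scrape that closes the window and the one that opens it.
      deltas[w] = cumulative[end] - cumulative[start];
    }
    return deltas;
  }
}
```

With a scrape every 5 seconds, `samplesPerWindow = 1` gives 5s windows and `samplesPerWindow = 12` gives 1min windows from the same raw samples, so Ozone itself only needs to expose the cumulative counter.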

Dec 13 '24 11:12 slfan1989

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

Nov 12 '25 00:11 github-actions[bot]

Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it.

Nov 19 '25 00:11 github-actions[bot]