WorkloadTools icon indicating copy to clipboard operation
WorkloadTools copied to clipboard

Power BI Report Calculation - possible errors

Open cairneym opened this issue 5 years ago • 6 comments

From my workload, the summary calculations for the averages appear to use the total number of times the query has executed between BOTH the baseline and the benchmark rather than working out the average per workload.

The attached shows the Query Detail from drillthrough for a selected query hash with only a single execution for each of the baseline and benchmark traces, but the values shown in the table in the Queries summary is 50% of what would be expected, suggesting that both entries from each trace are used to calculate the average.

query_summary

query detail

cairneym avatar Apr 09 '20 07:04 cairneym

Hi, Each measure filters only a single workload, therefore this should not happen.

Can you share your pbix file or the data related to that Sql hash so I can have a better look at your issue?

Trovalo avatar Apr 09 '20 09:04 Trovalo

I started digging in further to trace this back to source. The SQLite database DID actually have 2 queries that matched this query text pattern - it just so happened that both of them were executed within 1 second of each other.

The aggregation in the analysis therefore put them in the same bucket and hence the charts "appearing" to only show 1 value while making averages.

A suggestion to make this more obvious would be to include the number of executions over time as an additional chart on the Drillthrough report.

cairneym avatar Apr 13 '20 06:04 cairneym

I'm assuming that the number of executions feeds into the TOTALS calculations as well? If I copy the data from the Table from the "Queries" report tab and paste into Excel, I see what appears to be some rounding differences in the "Total" type columns, but I cannot make the "Average" type columns get even close to the same values. Using the (Total Duration/CPU/Reads)/(Number of Batches) gets closer values but still quite a way out.

Comparison: Averages.pdf

cairneym avatar Apr 13 '20 07:04 cairneym

Glad to hear that you have some additional info. About the first suggestion, the number of executions can be put in the chart itself as a tooltip (it will show when you hover the value).

It's still not easy to check it without the data, I should be able to replicate it just by using the data related to that "sql hash", a raw extraction from the database should be enough, I don't need any dimensional data (no db name , app name, query text, etc), just the metric values and their "sql hash" from the "PowerBI_WorkloadData" view. Can you share those data?

Trovalo avatar Apr 13 '20 08:04 Trovalo

I had a look at some of my data and didn't notice any anomaly between the data and the calculation values.

But I noticed that the averages might be unprecise. As an example, [Average Duration - Baseline] is calculated by computing the average of the column "Avg Duration (µs)" where the data series is "baseline". But that column is already a pre-aggregation of the time interval (ie: 1min) if in the same minute the query runs 10 times then "Avg Duration (µs)" is the average of 10 executions and this is not considered.

I will fix the averages soon, by:

  1. calculating a "weighted average" or
  2. calculating the average using "Sum Duration (µs)" / "Execution Count", which will probably be even more precise since it will have the decimals (we are talking about decimals of µs)

Update: Here you can find the new template. It will be official as soon as the PR is approved WorkloadTools Report - Template.zip

Trovalo avatar Apr 14 '20 09:04 Trovalo

PR approved 😄

spaghettidba avatar Apr 14 '20 17:04 spaghettidba