Fix the two-segment piecewise-linear regression calculation
https://github.com/WebKit/MotionMark/issues/66
The current implementation of the regression calculation has these flaws:
When processing (x[0], y[0]), L1 must be any line through (x[0], y[0]) that meets L2 at a point (x', y') where x[0] < x' < x[1]. L1 has no error. (A sketch of this boundary case follows the list of flaws.)
When processing (x[n - 2], y[n - 2]), L2 must be any line through (x[n - 1], y[n - 1]) that meets L1 at a point (x', y') where x[n - 2] < x' < x[n - 1]. L2 has no error.
The lambda calculation is incorrect. It includes a term called H which is equal to C - I. Looking at the algorithm of Kundu/Ubhaya, this should be just C.
lambda should be used when calculating L1 and (1 - lambda) should be used when calculating L2. Currently (1 - lambda) is used in calculating both L1 and L2.
The current calculation has this condition: if (t1 != t2) continue;. This condition is almost always true even when t1 and t2 are essentially equal, since it compares the floating-point values exactly. (A sketch of a tolerance-based comparison follows below.)
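To make the first two boundary cases concrete, here is a minimal, self-contained sketch in JavaScript. It is not the MotionMark implementation: fitLine and degenerateFirstSegment are illustrative names, and it assumes L2 is the ordinary least-squares line through every point except the first.

function fitLine(xs, ys) {
    // Ordinary least-squares fit returning { slope, intercept }.
    const n = xs.length;
    const sumX = xs.reduce((a, b) => a + b, 0);
    const sumY = ys.reduce((a, b) => a + b, 0);
    const sumXY = xs.reduce((a, x, i) => a + x * ys[i], 0);
    const sumXX = xs.reduce((a, x) => a + x * x, 0);
    const slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    return { slope: slope, intercept: (sumY - slope * sumX) / n };
}

function degenerateFirstSegment(xs, ys) {
    // L2 is fit to every point except (xs[0], ys[0]).
    const l2 = fitLine(xs.slice(1), ys.slice(1));

    // Pick any intersection abscissa x' strictly between x[0] and x[1];
    // the midpoint is a convenient choice.
    const xPrime = (xs[0] + xs[1]) / 2;
    const yPrime = l2.slope * xPrime + l2.intercept;

    // L1 is the line through (x[0], y[0]) and (x', y'). It passes exactly
    // through its only data point, so it contributes no error.
    const slope1 = (yPrime - ys[0]) / (xPrime - xs[0]);
    const l1 = { slope: slope1, intercept: ys[0] - slope1 * xs[0] };
    return { l1: l1, l2: l2, xPrime: xPrime };
}

The second flaw is the mirror image: L2 passes through the last point (x[n - 1], y[n - 1]) with no error and meets L1 at an x' strictly between x[n - 2] and x[n - 1].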
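And for the last flaw, a minimal sketch of a tolerance-based comparison; the helper name and the epsilon value are assumptions, not the actual MotionMark code.

function essentiallyEqual(a, b, epsilon = 1e-6) {
    // Treat a and b as equal when they differ only by floating-point noise,
    // using a relative tolerance.
    return Math.abs(a - b) <= epsilon * Math.max(Math.abs(a), Math.abs(b), 1);
}

console.log(essentiallyEqual(0.1 + 0.2, 0.3));  // true, even though 0.1 + 0.2 != 0.3
console.log(essentiallyEqual(1.0, 1.1));        // false

The loop guard would then read if (!essentiallyEqual(t1, t2)) continue; instead of if (t1 != t2) continue;.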
Said, I'm hoping Johnny and/or Victor will review this. If they are unable, I will.
Could we add a unit test for the old scoring algorithm, which this PR would then update? I'd also like to see an independent implementation of the scoring (either in code or in Numbers) to validate the algorithm.
Hi Johnny,
Thanks for your review.
I have not done extensive comparisons between the old and the new calculations. Our A/B system can't compare MotionMark changes.
But I did compare the JSON results of some runs with and without this change. If you open the developer.html page and drag and drop a JSON file onto the "Drop results here" button, MotionMark will calculate the scores without running the tests.
I am attaching two JSON files from two MotionMark runs on two different devices. I am also attaching screenshots of the results when I drop the JSON files locally and when I drop them on https://browserbench.org/MotionMark1.3.1/developer.html.
As you can see from the screenshots, there is a slight difference between the two calculations.
Hey Said,
First off, apologies it took a while to look into this again. I was able to reproduce the numbers you show in the screenshots. Did you have any thoughts on the Design test results in your first screenshot, where the 80% Confidence Interval is "0.00 - 547.00 -100.00% +17.20%"? The graph for that one does not look like the algorithm did a good job of fitting the lines to the data points, though the score is arguably close to correct. Here's what I see in the graph:
Hi Johnny,
The large confidence interval with the new rewrite happens because the optimal complexity is no longer clipped as it was before. This part of the original code removes the extreme results in the histogram you posted above (the ones close to zero and the ones around 550):
// Accept the segment intersection x_prime as the optimal complexity only if
// clipping is disabled or x_prime lies within [lowComplexity, highComplexity].
if (!options.shouldClip || (x_prime >= lowComplexity && x_prime <= highComplexity))
    x_best = x_prime;
else {
    // Discontinuous piecewise regression: fall back to x.
    x_best = x;
}
Please note that options.shouldClip was always set to true. I can restore this clipping.
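Since shouldClip was always true in practice, restoring it reduces to the choice below (a minimal restatement for clarity, not the actual patch):

x_best = (x_prime >= lowComplexity && x_prime <= highComplexity) ? x_prime : x;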