mutation-testing-elements icon indicating copy to clipboard operation
mutation-testing-elements copied to clipboard

New report terrible for skimming

Open commonquail opened this issue 4 years ago • 14 comments

The new report from 1.2.0, stryker-mutator/stryker-net#1818, is absolutely terrible for skimming. Yes, I can use the arrow buttons to navigate. No, that's not easier either. Neither does toggling visibility of killed mutants help because, in addition to being cumbersome to have to do all the time, it combines poorly with survived mutants:

image

image

I've downgraded to 1.1.0. Nothing else in 1.2.x is worth not being able to see what I'm doing.

commonquail avatar Dec 15 '21 12:12 commonquail

This new design is not .net specific. I'll move this issue to our shared repo for the HTML report

richardwerkman avatar Dec 15 '21 14:12 richardwerkman

What specific part of the new design do you not like? Are the buttons too small to click? I doubt we will completely go back to the old design. Maybe we can improve the new one?

richardwerkman avatar Dec 15 '21 14:12 richardwerkman

Thanks for the feedback @commonquail . I'm responsible for the new underlining, so sorry about that. I did have the support of others as well 😅

absolutely terrible for skimming

What do you mean by "skimming"?

I didn't mind the combination of underlining with survived + killed myself. But if more people feel that it makes it hard to read, we could easily remove the underline of the killed mutants there.

I also want to know exactly what suggestions you would have to make things better. The background color we used before worked, but it was hard to combine with the new diff mode, where developers also expect it to work with background colors. Please take the diff-view into account in your improvement suggestions.

nicojs avatar Dec 15 '21 14:12 nicojs

What do you mean by "skimming"?

https://en.wikipedia.org/wiki/Skimming_(reading)

The old report allowed me to merely look at it and see which regions required attention. The new report requires me to study it rather more carefully to find what requires attention, or to specifically navigate with the controls. While I appreciate that the controls are there, they are less efficient than simply using my eyes would have been; but now my eyes are ineffective. Comparing two reports for the same file is similarly more demanding now.

I didn't mind the combination of underlining with survived + killed myself.

The combination only matters insofar as displaying killed mutants could have been an alternative with which to partially restore said property. I don't really care about the combination and I would not normally display killed mutants anyway. It's not an issue but it's also not a solution.

The background color we used before worked, but it was hard to combine with the new diff mode, where developers also expect it to work with background colors.

Understandably. It did take me a few tries to learn how to read the old report's mutations, though now that I'm comfortable with it I think the diff-view is more different than that it is more effective.

I, too, ascribe the difference to the removal of background colors, in particular the distinction between "killed" and "survived". PIT also features background colors and I used it in the same manner (I forget how it displays uncovered code). With that in mind:

  • Different color sets, like in git diff --color-moved?
  • Different grayscale background colors?
  • Colored gutters?
  • Just one color, for "survived" (that's the signal)?

(Mind that I mostly build invisible things.)

For the record, I have no color vision deficiency.

commonquail avatar Dec 15 '21 15:12 commonquail

So if I understand correctly the survived mutants don't stand out enough? And applying a background colour to survived and no coverage mutants might help? @nicojs I think we could add that right?

richardwerkman avatar Dec 16 '21 10:12 richardwerkman

So if I understand correctly the survived mutants don't stand out enough?

Yep, that's an accurate summary.

commonquail avatar Dec 16 '21 15:12 commonquail

Wouldn't the idea from #1475 (or a minimap) cover skimming best?

Regarding the current report. wouldn't skimming be simpler if there simply was no underlining or highlighting as suggested in #1636?

feckertson avatar Jan 16 '22 12:01 feckertson

[minimap]

I see the potential but I don't know. If somebody makes it I'll happily test drive it but I don't really use minimaps for code editing, myself.

wouldn't skimming be simpler if there simply was no underlining or highlighting as suggested in #1636?

Removing the underlining would make almost no difference. My issue is not that I find the squiggles distracting so much as that I find them virtually invisible. The discs, further, disappear on the right-hand side and are not anchored. I can see how removing the underlining would make for a cleaner version of the diff view, however, to me the diff view itself is a regression, not an improvement.

Restoring the old-style block markers would help some. They have a much bigger clickable surface area than the discs and are anchored to the mutated statement instead of floating off into space. The anchoring offsets the radically decreased visibility of the mutated statement inside the diff view compared to the old report. This, without any passive background color, might well be sufficient to restore the overview I've lost.

Right now it is substantially harder to locate general regions of interest and the exact problematic statements. It has more to do with locality of reference than with minimalism versus whatever is the opposite of minimalism. To that end, the anchoring is so effective that the mere presence of an indicator is at times enough to signal the mutation without even expanding the mutation. With unanchored markers, that falls apart almost immediately.

Being unable to display more than a single mutation in a file at once has been surprisingly limiting, much more so than I anticipated.

Whether the indicators are round or square strikes me as irrelevant but the new ones are too small. I see no benefit to introducing multiple styles of indicators and a few downsides, including having to teach new users to disregard that noisy detail.

I don't care about the mutation score. The only thing I ever use that for is to communicate deltas. It is a completely unactionable piece of information. There is no alternative representation of it that changes this.

I do not think the new diff view is easier to read in general, even as it is in some of the sample screenshots spread around tickets. It benefits from knowledge transfer, which is likely to apply to a lot of users, in a way the old report did not, and it leaves no room for doubt about what code ended up running. In return, it's a lot harder to isolate the actual mutation in a complex statement inside the diff lines. In comparison, the old report directly contrasted the mutant and the mutated statement. Although both reports do less than stellar with protanomaly and deuteranomaly, the red-green color with subtle hue differences and the loss of locality makes the diff view objectively and materially worse. It seems to me a neat technical solution focused on the wrong problem, its appeal owing more to a false sense of familiarity than the efficacy with which it conveys its information. It's a lot more elegant, sure -- it just doesn't help me nearly as well.

commonquail avatar Jan 16 '22 18:01 commonquail

I do like the original markers without background highlighting like in the third example from #1636. They convey vastly more information than the new markers possibly can.

How about coupling the original markers with a diff view like the following ?

image image

It is difficult to make sense from simultaneously expanded, overlapping mutations. Whether it makes sense to view non-overlapping mutations simultaneously is debatable since the system only applies one mutation at a time. The need to expand multiples is also questionable if there is barely any need to expand even one:

the anchoring is so effective that the mere presence of an indicator is at times enough to signal the mutation without even expanding the mutation.

feckertson avatar Jan 17 '22 11:01 feckertson

like in the third example from #1636.

I meant to reference that but forgot. I agree.

How about coupling the original markers with a diff view like the following ?

That looks like status quo...?

It is difficult to make sense from simultaneously expanded, overlapping mutations. Whether it makes sense to view non-overlapping mutations simultaneously is debatable since the system only applies one mutation at a time.

It is not difficult for me. Being able to view multiple mutations at a time enables me to do more work between executions, which saves time even when narrowing the mutated targets.

The need to expand multiples is also questionable if there is barely any need to expand even one:

There sometimes isn't, because you learn to recognize which mutations appear where, or can infer from context. For example, all the displayHTML.push() lines in the third example have the same mutation. The old report suggests it's probably passing something like null or "" to push whereas the new report includes the possibility of eliding the statement altogether. On the other hand, mutations 16-18 are not clear, though my guess is they relate to comparison.

commonquail avatar Jan 17 '22 17:01 commonquail

That looks like status quo...?

Pretend the new indicators are not there and instead that the original indicators are embedded in text that is lined out with the selected indicator highlighted in blue.

I created my examples by tweaking an html that was actually generated and I wouldn't care to tweak an original html to get a difference that shows up as if you crossed out one line (or block) and added a replacement directly below it. The replacement text should be highlighted with baby blue to match the selected indicator rather than the dark color, I suppose.

This presumes only one mutation is selected. If that is what you meant by "status quo", then yes, it is.

feckertson avatar Jan 17 '22 23:01 feckertson

If that is what you meant by "status quo", then yes, it is.

I misread. I thought you were using that picture to demonstrate the "old markers with diff view" but the picture is a demonstration of an alternative diff view.

I don't think I am competent to remark on alternative forms of the diff view. To me, the line based diff view is just incompatible with ~branch based mutations. Relying on different hues instead of saturation for highlights would probably help, though.

commonquail avatar Jan 18 '22 07:01 commonquail

It cannot be incompatible unless the diff view from 5.0.0+ already is because it is the exact same diff view from 5.0.0+ with the background color removed and the font-style switched to line-through.

feckertson avatar Jan 18 '22 08:01 feckertson

Yes, I think the whole diff view is fundamentally a regression. Not so bad I can't work with it but so bad that I choose not to as long as I can forgo other improvements in the driver. For now, Stryker.NET 1.1.x is simply a much more effective tool for me than Stryker.NET 1.2.x is.

commonquail avatar Jan 18 '22 09:01 commonquail