stack handles negative zero

Open mbostock opened this issue 2 years ago • 10 comments

Ref. https://github.com/observablehq/plot/discussions/1963#discussioncomment-7993585

mbostock avatar Jan 02 '24 18:01 mbostock

🤔 We don't support -0 anywhere else, so this introduces an incongruity. For example, if the value we need to stack is the output of an aggregation with a "sum" reducer, we wouldn't be able to generate -0 (although the whole series might be sums of negative numbers).
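To illustrate the point about reducers: a minimal sketch, assuming a sum that starts from the usual +0 accumulator. A plain `reduce` stands in for the "sum" reducer here; it is not Plot's implementation.

```javascript
// Illustrative only: a plain reduce stands in for a "sum" reducer.
const isNegativeZero = (x) => Object.is(x, -0);

// Starting from +0, adding -0 leaves +0, so the sign of the zeros is lost.
const sumOfNegativeZeros = [-0, -0].reduce((a, b) => a + b, 0);

// An empty group also sums to +0.
const sumOfEmptyGroup = [].reduce((a, b) => a + b, 0);

console.log(isNegativeZero(sumOfNegativeZeros)); // false
console.log(isNegativeZero(sumOfEmptyGroup)); // false
```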

As an alternative, I'd suggest adding any zeroes to yn if yp is zero. The patch can't be simpler:

-            else Y2[i] = Y1[i] = yp; // NaN or zero
+            else Y2[i] = Y1[i] = yp || yn; // NaN or zero
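A rough sketch of what this one-line change does, using a simplified model of a single stack. The function `stackOne` and the two rule callbacks are hypothetical names for illustration, not the actual Plot source.

```javascript
// Simplified model of stacking one series of values; not the actual Plot source.
// zeroRule(yp, yn) picks the baseline for a zero (or NaN) value.
function stackOne(values, zeroRule) {
  let yp = 0; // running top of the positive stack
  let yn = 0; // running bottom of the negative stack
  const intervals = [];
  for (const y of values) {
    if (y < 0) {
      intervals.push([yn, yn + y]);
      yn += y;
    } else if (y > 0) {
      intervals.push([yp, yp + y]);
      yp += y;
    } else {
      const base = zeroRule(yp, yn); // NaN or zero
      intervals.push([base, base]);
    }
  }
  return intervals;
}

const currentRule = (yp, yn) => yp; // behavior before the patch
const patchedRule = (yp, yn) => yp || yn; // behavior after the patch

const negativesBefore = stackOne([-1, 0, -2], currentRule); // [[0, -1], [0, 0], [-1, -3]]
const negativesAfter = stackOne([-1, 0, -2], patchedRule); // [[0, -1], [-1, -1], [-1, -3]]
const positivesAfter = stackOne([1, 0, 2], patchedRule); // [[0, 1], [1, 1], [1, 3]]
```

In this model, the zero in an all-negative stack moves from the y = 0 baseline to the current bottom of the negative stack, while zeroes in an all-positive stack keep their input order.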

Fil avatar Jan 02 '24 20:01 Fil

(Also, re-reading this code, I wonder why compare is in its own loop and reverse is not. For readability, could we move compare just above reverse, or vice versa?)

Fil avatar Jan 02 '24 20:01 Fil

My (very unimaginative) unit test:

export async function stackZeroes() {
  const data = Array.from({length: 100}, d3.randomNormal.source(d3.randomLcg(42))()).map((value, i) => ({
    value: i % 7 === 0 ? 0 : value,
    x: Math.floor(i / 3),
    series: i % 3
  }));
  return Plot.plot({
    marks: [
      Plot.areaY(data, {x: "x", fill: "series", y: (d) => Math.min(d.value, 0)}),
      Plot.areaY(data, {x: "x", fill: "series", y: (d) => Math.max(d.value, 0)})
    ]
  });
}

Fil avatar Jan 02 '24 20:01 Fil

The alternative you suggest is not desirable as it will cause (positive) zero values to swap to the bottom when all the other values are positive, rather than maintaining the input order.

The purpose of this change is simply to allow -0 to indicate a zero value that stacks with the other negative values. We don’t need to do anything else.

mbostock avatar Jan 02 '24 20:01 mbostock

I've spent a moment trying to understand the situation you're describing, but I don't get it. It seems to me that (EDIT: with my suggested patch) zeroes in a stack of positives are properly stacked in input order, as are zeroes in a stack of negatives. This solves the original issue with two separate series min(V, 0) and max(V, 0).

The "known unknown" case is when a zero arrives in a stack where we've already seen non-negatives and non-positives. In that case it goes arbitrarily to the top of the positive stack (yp); this might not be desirable and could (additionally) be explicitly routed with -0.

Fil avatar Jan 02 '24 21:01 Fil

@Hvass-Labs says it doesn’t work with 0, as I linked in the OP: https://github.com/observablehq/plot/discussions/1963#discussioncomment-7993585. But it seems like we need the actual data to confirm that this is the problem and not something else.

mbostock avatar Jan 02 '24 22:01 mbostock

You can see the problem in the linked images:

[image]

The negative zero values jump up to the y = 0 baseline because they are erroneously considered positive, hence the need to use -1e-35 as the workaround.
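For reference, a small sketch of how -0 behaves under ordinary JavaScript comparisons, which is why a test such as `y < 0` does not route it to the negative side. The variable names are illustrative.

```javascript
const negZero = -0;
const workaround = -1e-35; // the tiny negative value mentioned above

const comparisons = {
  negZeroIsLessThanZero: negZero < 0, // false: -0 is not less than 0
  negZeroEqualsZero: negZero === 0, // true: === does not distinguish the sign
  objectIsDistinguishes: !Object.is(negZero, 0), // true: Object.is sees the sign
  reciprocal: 1 / negZero, // -Infinity
  workaroundIsLessThanZero: workaround < 0, // true: a real negative number
  minKeepsSign: Object.is(Math.min(0, -0), -0) // true: Math.min prefers -0 over +0
};

console.log(comparisons);
```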

mbostock avatar Jan 02 '24 22:01 mbostock

I've edited my comment above to clarify that it relates to my suggested patch.

Fil avatar Jan 02 '24 22:01 Fil

Conceptually, -0 is a negative number and hence should be stacked with the other negative values. This feels more explicit than having the treatment of negative zero depend on whether there’s any preceding positive value in the current stack.

Edit: I’m proposing that we use negative zero to indicate explicitly that you want the zero to stack with the negative values rather than the positive values.

mbostock avatar Jan 02 '24 22:01 mbostock

Agree that if the user passes -0 it is logical to stack it on yn. But should we depend on this peculiar value to answer the OP? It feels like an easter egg. (I understand it's a Number, but it feels like a special symbol that needs specific documentation and is not so easily generated).

I've pushed my suggestion to #1968 — with a funky test case.

But note that we could have both: stack a 0 value on the "active" side (yp || yn), and stack an explicit -0 on the negative side yn:

-            else Y2[i] = Y1[i] = yp; // NaN or zero
+            else Y2[i] = Y1[i] = 1 / y === -Infinity ? yn : yp || yn; // NaN or zero
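A minimal sketch of how the combined expression resolves in a few cases. The helper `zeroBaseline` is a hypothetical name that simply wraps the expression from the patch; `yp` and `yn` are the running positive and negative totals.

```javascript
// Hypothetical helper wrapping the expression from the patch above.
// y is the value being stacked (zero, negative zero, or NaN).
function zeroBaseline(y, yp, yn) {
  return 1 / y === -Infinity ? yn : yp || yn;
}

const results = [
  zeroBaseline(0, 1, -1), // 1: a plain zero goes on the positive side when yp is nonzero
  zeroBaseline(-0, 1, -1), // -1: an explicit -0 always goes on the negative side
  zeroBaseline(0, 0, -1), // -1: a plain zero falls back to yn when yp is zero
  zeroBaseline(0, 0, 0) // 0: nothing stacked yet, so the baseline is zero
];

console.log(results);
```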

I don't want to block this further, I just hope I've made my case :) Whatever you prefer, it will be good to include several test plots.

Fil avatar Jan 03 '24 15:01 Fil