Jitter transform?

Open mbostock opened this issue 2 years ago • 9 comments

Especially in conjunction with the dodge transform when the inputs are highly quantized. Another option could be that the dodge transform interprets the input values as a range (with some uncertainty) and finds the position within the range that minimizes the stacked value.

Ref. https://observablehq.com/d/a26e34c1902de423

mbostock avatar May 12 '23 18:05 mbostock

I'm not opposed to the idea, but this can be written as a map: Plot.mapX((v) => v.map((d) => d + random()), { … x: "rating" … }), which might be easier to read, modify, and expand upon, even for people who are only a little familiar with JavaScript.

I like the second idea better, where the dodge transform would allow some uncertainty to optimize its layout. It would be great to have a less stringent dodge transform, for when the x value is itself just an indication (a sample) of the value, and not an exact value that needs to be perfectly encoded—and we are willing to use this to make a more compact chart.

The best algorithm in that case might be something other than jitter, so it makes more sense to offer this as an option to the dodge transform (such as layout: "jitter", which could be complemented by other algorithms, or with a custom function). For example, why jitter a point that is on its own if its position is already optimal? An algorithm that tries 5 positions and retains the best at each step might perform better.
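To make the "try a few positions and keep the best" idea concrete, here is a rough sketch in plain JavaScript. It is not how Plot's dodge transform works; dodgeWithTolerance and its r, tolerance and tries options are hypothetical names, and the collision search is a naive O(n²) scan over equal-radius circles.

```javascript
// Hypothetical sketch: greedy dodge along y that may shift each point's x
// by up to ±tolerance, keeping the candidate with the lowest stack height.
function dodgeWithTolerance(xs, {r = 3, tolerance = 4, tries = 5} = {}) {
  const d = 2 * r; // minimum distance between circle centers
  const placed = [];
  // Small epsilon so that circles that merely touch do not count as colliding.
  const collides = (x, y) =>
    placed.some((p) => Math.hypot(p.x - x, p.y - y) < d - 1e-9);
  // Lowest y >= 0 at which a circle centered on x touches nothing placed so far.
  const lowestY = (x) => {
    const candidates = [0];
    for (const p of placed) {
      const dx = p.x - x;
      if (Math.abs(dx) < d) candidates.push(p.y + Math.sqrt(d * d - dx * dx));
    }
    candidates.sort((a, b) => a - b);
    return candidates.find((y) => !collides(x, y));
  };
  return xs.map((x0) => {
    let best = null;
    for (let i = 0; i < tries; ++i) {
      // Evenly spaced offsets in [-tolerance, +tolerance]; a single try stays at x0.
      const t = tries === 1 ? 0 : (i / (tries - 1)) * 2 - 1;
      const x = x0 + t * tolerance;
      const y = lowestY(x);
      // Prefer the lowest y; on ties, prefer the position closest to the true x.
      const closer = best !== null && Math.abs(x - x0) < Math.abs(best.x - x0);
      if (best === null || y < best.y || (y === best.y && closer)) best = {x, y};
    }
    placed.push(best);
    return best;
  });
}
```

With tolerance: 0 and tries: 1 this degrades to a strict dodge; a point that already sits alone keeps its exact x, because the tie-break favors the unshifted position.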

Fil avatar May 12 '23 18:05 Fil

Yes, improving the dodge transform is the better solution for the motivating use case. But jittering is a nice tool to have too.

I agree this could be implemented as a map method, too. I think the main advantage over calling Math.random would be to make it seeded by default so that the layout is deterministic and reproducible.

mbostock avatar May 12 '23 18:05 mbostock

Yes, I've left out the definition of random :-) Some might want a seeded randomNormal instead of uniform.
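For reference, a minimal sketch of what a seeded random could look like here, in plain JavaScript with no dependencies. The jitter helper and its width, seed and distribution options are hypothetical, not part of Plot; mulberry32 is just one small, well-known seedable generator chosen for illustration.

```javascript
// Small seeded PRNG (mulberry32); returns uniform numbers in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: returns a map function (values => jittered values).
function jitter({width = 0.1, seed = 42, distribution = "uniform"} = {}) {
  return (values) => {
    // Re-seeded on every call, so the same input always gives the same output.
    const random = mulberry32(seed);
    // Uniform offset in [-width, width).
    const uniform = () => (random() * 2 - 1) * width;
    // Normal offset with standard deviation `width` (Box-Muller transform).
    const normal = () => {
      const u = 1 - random(); // in (0, 1], keeps the logarithm finite
      const v = random();
      return width * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
    };
    const offset = distribution === "normal" ? normal : uniform;
    // Leave missing values untouched.
    return Array.from(values, (d) => (d == null ? d : d + offset()));
  };
}
```

Assuming Plot.mapX accepts a function of the values array, as in the snippet above, this could be used as Plot.mapX(jitter({width: 0.05}), { … x: "rating" … }). Because the generator is re-seeded on every call, each series the map is applied to would receive the same sequence of offsets.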

Fil avatar May 12 '23 18:05 Fil

Something else to consider is what happens when the dodged axis is categorical. In this example, I would like to be able to use a small jitter to prevent the dots from overlapping with other categories.

nachocab avatar Aug 11 '24 20:08 nachocab

@nachocab I think in that example you would want jitter along x (the quantitative axis), not along y, since the dodgeY transform is already computing the y offset. For example, if you use x: (d) => d["IMDB Rating"] && (d["IMDB Rating"] + Math.random() * 0.1) you get this:

[screenshot]

mbostock avatar Aug 12 '24 00:08 mbostock

@mbostock Thanks! That helps, but I guess what I'm really after is a way to limit the height of each facet. For example, if there are many dots, jittering on x can still lead to collisions: [image]

Basically, I just want to sample from a uniform distribution, like in this example. I think that's what geom_jitter(width=0.1) in ggplot does:

[image]

nachocab avatar Aug 12 '24 05:08 nachocab

For this particular chart, since the category is given by the facet fy, you can use y to jitter:

Plot.dot(data, { x: "date", fill: "cat", r: 1.5, fillOpacity: 0.5, y: Math.random })

If you also set y: {axis: null} you get:

[Screenshot 2024-08-12 at 10 28 59]

Another approach could be:

Plot.rect(data, Plot.binX({opacity: "count", interval: 1}, { x: "date", fill: "cat", inset: 0 }))

with

opacity: {range: [0.1, 1], type: "log"},

[Screenshot 2024-08-12 at 10 31 50]

Fil avatar Aug 12 '24 14:08 Fil

Thanks, Fil! I would never have thought of that.

Here is a simplified example where I reduced the size of each jitter band and increased the separation by playing with fy.padding, height and insetTop/insetBottom. I'm not sure if there's a better way.

[screenshot]

As opposed to the default:

[screenshot]

nachocab avatar Aug 13 '24 06:08 nachocab

Sure! You could also say y: {axis: null, domain: [-0.5, 1.5]} (since the random distribution covers [0, 1], this would leave some gaps too). Many options!
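To see how much gap that domain leaves, here is the arithmetic as a small sketch. It assumes a plain linear y scale and ignores any facet insets or padding; linearScale and facetHeight are made-up names for illustration, not Plot API.

```javascript
// Linear scale sketch: maps a value from domain [d0, d1] to range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return (v) => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
}

const facetHeight = 100; // hypothetical facet height in pixels
// The y axis points up, so the domain minimum maps to the bottom pixel.
const y = linearScale([-0.5, 1.5], [facetHeight, 0]);

const bandTop = y(1); // pixel position of the largest jitter value
const bandBottom = y(0); // pixel position of the smallest jitter value
const bandFraction = (bandBottom - bandTop) / facetHeight;

console.log({bandTop, bandBottom, bandFraction});
// → { bandTop: 25, bandBottom: 75, bandFraction: 0.5 }
```

So with this domain the dots occupy the middle half of each facet, leaving a quarter of the height empty above and below; widening the domain further shrinks the band.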

Fil avatar Aug 14 '24 16:08 Fil