plot Revisit facets

demo for this PR: https://observablehq.com/@observablehq/plot-facet-annotations-1041

The changes listed below address the following issues:

Facets are currently exclusive

A single datum can’t currently appear in two different facets.

This results in:

a bug when using transforms on facet: “exclude” (because the indices are not exclusive, and a transform might want to assign two different values to the same index in two different facet)
forbidding generic “filters” that define what data goes into which facet (beyond the default, equivalent to the “eq” filter, and “exclude”, equivalent to the “ne” filter); the typical use case is cumulative faceting, with an “lte” filter.

[x] ~~Change 1: fix the bug by expanding channels as necessary (commit b3cdc4a22ab6a53f87591a4aaa916d0de3ed0d45)~~ This part is now superseded by ~~#1057~~ #1068, #1069 or #1070.
[x] Change 2: introduce facet filters (commit bf8fb3f4c5911bd97bb3d39e94545939de5fe4d4)

new named filters to introduce: “eq”, “lt”, “lte”, “gt”, “gte”, as well as a generic filter function that receives as inputs the channel value and the facet value.
Q: should we introduce “ne” (equivalent to “exclude”—except that “exclude” will not impact empty facets)
Q: should we deprecate “include” and “exclude”?

✓ confirmed that bf8fb3f4c5911bd97bb3d39e94545939de5fe4d4 can be cherry-picked on top of #1068 ✓ superseded by #1089

Facets are currently defined globally

Marks can opt in or out of faceting, but currently they have to be aligned on the same facet index (or its complement if using the “exclude” filter). If we allowed to define facets on each mark, we could have two different data with different shapes correctly and logically connected to the facets.

This seems important for time channels but also for trivial things like “adding a specific text on a specific facet”, which is currently too difficult (one has to define channels of the same lengths as the original facet data, and use Plot.selectFirst on them).

[x] Change 3: define facets data from a mark. Needs to read the channel before initialize, and gather it with other marks’ and global facet channels to define the correct index structure. ~~(commit 0c2fadffd17cb77a81de33b1998ec20d2afe8858)~~ extracted to PR #1085.

Note: it makes particular sense to define facet filters on a mark (as in change 2), rather than globally, but we could decide to have both.

Facets are currently limited to x and y

Plan is to extend them to (at least) a time facet.

[x] Change 4: generalize facet names a little. Remove the “1d/2d” code in favor of a more generic n-dimension structure. Reimplement the current optimizations afterwards.

Open questions

API Design

In parallel we need to think about the API design. I’ll start with something and it should be pretty easy to change. Backward compatibility is a hard requirement.

Here are various possibilities:

Plot.dot(data, {facet: {y: “year”, yFilter: “lte”}}) or Plot.dot(data, {facet: {y: {value: “year”, filter: “lte”}}}) or Plot.dot(data, {facet: {time: “year”, timeFilter: “lte”}}) or Plot.dot(data, {time: “year”, timeFilter: “lte”}) // shorthand? original proposal for the time channel Plot.dot(data, {fy: “year”, fyFilter: “lte”}})c// shorthand??

[x] I started with Plot.dot(data, {facet: {y: “year”, yFilter: “lte”}}), but now that I've been using this a bit, I want the API to be flat, like so:

Plot.dot(data, {time: “year”, timeFilter: “lte”})
Plot.dot(data, {fy: “year”, yFilter: “lte”})

The remainder of this process, for time facets and animations, will belong to a new pull request

Variable scales

How do we design a scale that varies with a facet (its domain being driven by the extent of channels in this facet)?

Time facets

Change 5: implement time facet / animation (some draft code in PR #1018).
Change 6: implement (partial) web animations API (some draft code in this notebook).

The time facet is a bit different from x and y, because its output might be an animation inside a rather than a row or column of elements. Also, it makes a natural use of facet filtering such as “lte”, for example to display a connected scatterplot.

Sep 05 '22 15:09 Fil

To consider:

[ ] can we fix an inefficiency where we might compute the facet scales twice? maybe it's inevitable (in the sense that we need the facets before we compute the transforms, but the facet scales can be sorted by the other channels), and the only optimization we can do is avoiding running through the channels twice to find the domain.
[ ] Can we avoid expanding the index if all the transform does is to count/reduce data? Not sure if the transforms could have a flag that says "I accept duplicated indices"?
- [ ] it appears that we could limit the call to this function to the only transform(s?) that need it—like Plot.stack—and decide that it's the responsibility of any custom transform
[ ] Will it work with time facets (I think it's going to be fine, but want to confirm with a workable PR first)

Sep 12 '22 11:09 Fil

~~There is a bug with date facets: the eq filter fails. Fix incoming (hopefully!).~~

Sep 13 '22 16:09 Fil