Changing defaults for `estimate_relation()`? (and for `data = "grid"`)
I think we should change the behaviour of data = "grid" in estimate_relation() (and related functions), and add an option like data = "fullgrid". With this, we could have:
# estimate_expectation(data = "fullgrid"), previous behaviour
m <- lm(Sepal.Width ~ Species * Sepal.Length, data = iris)
insight::get_datagrid(m, "all")
#> Species Sepal.Length
#> 1 setosa 4.3
#> 2 setosa 4.7
#> 3 setosa 5.1
#> 4 setosa 5.5
#> 5 versicolor 5.1
#> 6 versicolor 5.5
#> 7 versicolor 5.9
#> 8 versicolor 6.3
#> 9 versicolor 6.7
#> 10 virginica 5.1
#> 11 virginica 5.5
#> 12 virginica 5.9
#> 13 virginica 6.3
#> 14 virginica 6.7
#> 15 virginica 7.1
#> 16 virginica 7.5
#> 17 virginica 7.9
# estimate_expectation(data = "grid") - estimate_relation() should default to this
m <- lm(Sepal.Width ~ Species * Sepal.Length, data = iris)
insight::get_datagrid(m, "all", range = "grid")
#> Species Sepal.Length
#> 1 setosa 5.015267
#> 2 versicolor 5.015267
#> 3 versicolor 5.843333
#> 4 versicolor 6.671399
#> 5 virginica 5.015267
#> 6 virginica 5.843333
#> 7 virginica 6.671399
I think this would be helpful with #201 / #189 and #199 / #145.
However, this requires the GitHub version of insight to be on CRAN.
But what about preserve_range?
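To make the difference between the two grids above concrete, here is a minimal base-R sketch of how the numeric values could be chosen. It is not insight's implementation: the helper name `numeric_values()` and its arguments are hypothetical, and the "mean minus SD, mean, mean plus SD" rule is an assumption that happens to be consistent with the values in the second reprex. Dropping combinations outside each species' observed range is left out here.

```r
# Hypothetical helper, base R only; not part of insight or modelbased.
numeric_values <- function(x, type = c("grid", "fullgrid"), n = 10) {
  type <- match.arg(type)
  if (type == "fullgrid") {
    # Previous behaviour: n equally spaced values across the observed range.
    seq(min(x), max(x), length.out = n)
  } else {
    # Assumed new behaviour for numerics that are not in first position:
    # three representative values (mean - SD, mean, mean + SD).
    mean(x) + c(-1, 0, 1) * stats::sd(x)
  }
}

numeric_values(iris$Sepal.Length, "grid")
#> [1] 5.015267 5.843333 6.671399
length(numeric_values(iris$Sepal.Length, "fullgrid"))
#> [1] 10
```

With three species, the reduced grid has at most 9 rows before any range filtering, compared with 30 for the full grid.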
m <- lm(Sepal.Width ~ Species * Sepal.Length, data = iris)
insight::get_datagrid(m)
#> Sepal.Length Species
#> 1 4.3 setosa
#> 2 4.7 setosa
#> 3 5.1 setosa
#> 4 5.5 setosa
#> 5 5.1 versicolor
#> 6 5.5 versicolor
#> 7 5.9 versicolor
#> 8 6.3 versicolor
#> 9 6.7 versicolor
#> 10 5.1 virginica
#> 11 5.5 virginica
#> 12 5.9 virginica
#> 13 6.3 virginica
#> 14 6.7 virginica
#> 15 7.1 virginica
#> 16 7.5 virginica
#> 17 7.9 virginica
insight::get_datagrid(m, preserve_range = FALSE)
#> Sepal.Length Species
#> 1 4.3 setosa
#> 2 4.7 setosa
#> 3 5.1 setosa
#> 4 5.5 setosa
#> 5 5.9 setosa
#> 6 6.3 setosa
#> 7 6.7 setosa
#> 8 7.1 setosa
#> 9 7.5 setosa
#> 10 7.9 setosa
#> 11 4.3 versicolor
#> 12 4.7 versicolor
#> 13 5.1 versicolor
#> 14 5.5 versicolor
#> 15 5.9 versicolor
#> 16 6.3 versicolor
#> 17 6.7 versicolor
#> 18 7.1 versicolor
#> 19 7.5 versicolor
#> 20 7.9 versicolor
#> 21 4.3 virginica
#> 22 4.7 virginica
#> 23 5.1 virginica
#> 24 5.5 virginica
#> 25 5.9 virginica
#> 26 6.3 virginica
#> 27 6.7 virginica
#> 28 7.1 virginica
#> 29 7.5 virginica
#> 30 7.9 virginica
Created on 2022-08-15 by the reprex package (v2.0.1)
But what about preserve_range?
What do you mean? That argument still works... I was just thinking about having two options of "grids", and therefore changing the default behaviour.
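For anyone following along, the effect of preserve_range in the reprex above can be approximated in base R. This is only a sketch and not insight's actual code: the grid length of 10 is an assumption chosen to match the output above, and all variable names are made up for illustration.

```r
# Base-R approximation of range-preserving grids; not insight's implementation.
n_values <- 10  # assumed number of grid points, matching the reprex output

# Full grid: every species crossed with equally spaced Sepal.Length values.
lengths <- seq(min(iris$Sepal.Length), max(iris$Sepal.Length), length.out = n_values)
full <- expand.grid(Sepal.Length = lengths, Species = levels(iris$Species))

# Observed Sepal.Length range within each species.
lower <- tapply(iris$Sepal.Length, iris$Species, min)
upper <- tapply(iris$Sepal.Length, iris$Species, max)

# Keep only rows whose Sepal.Length lies inside the range observed for that species.
species <- as.character(full$Species)
keep <- full$Sepal.Length >= as.numeric(lower[species]) &
  full$Sepal.Length <= as.numeric(upper[species])
preserved <- full[keep, ]

nrow(full)
#> [1] 30
nrow(preserved)
#> [1] 17
```

The 30 rows correspond to the preserve_range = FALSE output and the 17 rows to the default output shown above.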
I think we should change the behaviour of data = "grid" in estimate_relation() (and related), and add an option like data = "fullgrid".
I'm not quite sure from the reprex what the new behavior would be.
"fullgrid" will become the old "grid", and "grid" will use fewer values for numeric variables that are not in the first position. This should address #189.
Is this something that could be done by visualization_recipe? Given a grid or data frame, when a variable is in the second position and gets mapped to color, the data would be subset to 3-5 representative values.
(Not sure that's the best idea, but throwing it out there)
I think adding another layer of transformation for visualizations would do more harm than good. The plot method should make do with what it has, and users should eventually learn how to get the grid they want to make their plots clearer.
visualization_recipe?
I think this is something for visualization_matrix() (or rather get_datagrid()), and it's already implemented in insight (see the very first post at the top).