(en)quo(s) vs (en)sym(s)
Capture the essence of this twitter discussion somewhere:
https://twitter.com/JennyBryan/status/1088859123658018816
Summary: enquo() is a better all-purpose default quotation mechanism to recommend than ensym(). That is, if you're only going to learn 1 of these, make it enquo().
So I think what we thought we were buying with ensym() was some guarantee that the object would actually resolve to a column name or error.
We don't get that with enquo(), so it might also be worth discussing the pattern for getting that back. Unless I've missed it, I don't see an rlang function to test whether a quosure contains something that is a symbol. It's not is_symbol or is_symbolic.
Thinking one step further than this, if such a function does exist, and you were going to recommend it be combined with enquo(), I'd have to ask why not combine the two into one function? So it would be something like enquo_sym() that captures an expression and an environment, and errors if the expression is not a symbol.
So I think what we thought we were buying with ensym() was some guarantee that the object would actually resolve to a column name or error. We don't get that with enquo(), so it might also be worth discussing the pattern for getting that back.
Re: getting this guarantee "back". Neither ensym() nor enquo() actually provides this property. But enquo() comes closest, in the sense that the quoted user input is evaluated with a more limited scope.
library(tidyverse)
summarise_ensym <- function(.data, summarise_col) {
  Spal.Length <- rep_len(0, nrow(iris))
  .data %>%
    group_by(Species) %>%
    summarise(
      avg = mean(!!ensym(summarise_col))
    )
}
summarise_enquo <- function(.data, summarise_col) {
  Spal.Length <- rep_len(0, nrow(iris))
  .data %>%
    group_by(Species) %>%
    summarise(
      avg = mean(!!enquo(summarise_col))
    )
}
## Same result when all is well, e.g. no unfortunate typos / name collisions
summarise_ensym(iris, Sepal.Length)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 5.01
#> 2 versicolor 5.94
#> 3 virginica 6.59
summarise_enquo(iris, Sepal.Length)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 5.01
#> 2 versicolor 5.94
#> 3 virginica 6.59
## evaluation of the `ensym()`d input happens with execution env in scope
summarise_ensym(iris, Spal.Length)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 0
#> 2 versicolor 0
#> 3 virginica 0
## not so with `enquo()`d input
summarise_enquo(iris, Spal.Length)
#> Error in ~Spal.Length: object 'Spal.Length' not found
## however both can still find the "wrong" object in global env
## although execution env is still consulted first for `ensym()`d user input
Spal.Length <- rep_len(50, nrow(iris))
summarise_ensym(iris, Spal.Length)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 0
#> 2 versicolor 0
#> 3 virginica 0
summarise_enquo(iris, Spal.Length)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 50
#> 2 versicolor 50
#> 3 virginica 50
Ptal.Width <- rep_len(1000, nrow(iris))
summarise_ensym(iris, Ptal.Width)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 1000
#> 2 versicolor 1000
#> 3 virginica 1000
summarise_enquo(iris, Ptal.Width)
#> # A tibble: 3 x 2
#> Species avg
#> <fct> <dbl>
#> 1 setosa 1000
#> 2 versicolor 1000
#> 3 virginica 1000
Created on 2019-01-28 by the reprex package (v0.2.1)
A related situation holds if these 2 functions are defined and exported in a package, i.e. now !!ensym(var) can resolve to something in the namespace environment, but !!enquo(var) does not.
I don't see an rlang function to test whether a quosure contains something that is a symbol
It is quo_is_symbol().
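A rough sketch of how that check could be combined with enquo(), along the lines of the enquo_sym() idea raised earlier. The names check_quo_symbol() and capture_col() are made up for illustration and are not rlang functions. Note this only verifies that the user supplied a bare symbol; it does not guarantee that the symbol resolves to a column.

```r
library(rlang)

# Hypothetical helper (not part of rlang): pass the quosure through
# unchanged if it wraps a bare symbol, otherwise raise an error.
check_quo_symbol <- function(quo) {
  if (!quo_is_symbol(quo)) {
    abort("Expected a bare symbol, not an expression.")
  }
  quo
}

# Hypothetical wrapper showing the intended usage: capture the user's
# input with enquo(), then validate it before unquoting it anywhere.
capture_col <- function(col) {
  check_quo_symbol(enquo(col))
}

capture_col(Sepal.Length)            # a quosure wrapping the symbol
try(capture_col(Sepal.Length * 2))   # errors: input is a call, not a symbol
```

The quosure still carries the caller's environment, so unquoting it with !! behaves the same as a plain enquo() capture.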
So I think what we thought we were buying with ensym() was some guarantee that the object would actually resolve to a column name or error.
To get this guarantee you can use .data[[mycol]], or go through tidyselect.
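A minimal sketch of the .data[[mycol]] option, assuming the column name is passed as a string. summarise_data_pronoun() and its df argument are hypothetical names chosen for this example. Because the lookup goes through the .data pronoun, a same-named object in the global or execution environment is not consulted.

```r
library(dplyr)

# Hypothetical function: the column is given as a string and looked up
# strictly in the data frame via the .data pronoun.
summarise_data_pronoun <- function(df, summarise_col) {
  df %>%
    group_by(Species) %>%
    summarise(avg = mean(.data[[summarise_col]]))
}

# A misspelled object in the global env, as in the reprex above
Spal.Length <- rep_len(50, nrow(iris))

summarise_data_pronoun(iris, "Sepal.Length")       # per-species means
try(summarise_data_pronoun(iris, "Spal.Length"))   # errors: no such column
```

The trade-off is that the caller passes a string rather than a bare name, so this is a different interface from the enquo() and ensym() versions.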