`vars()` seems required for `rows_*()` but not other functions
Prework
- [x] Read and agree to the code of conduct and contributing guidelines.
- [x] If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
- [x] Post a minimal reproducible example so the maintainer can troubleshoot the problems you identify. A reproducible example is:
- [x] Runnable: post enough R code and data so any onlooker can create the error on their own computer.
- [x] Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
- [x] Readable: format your code according to the tidyverse style guide.
Description
Not entirely sure this is a bug, but it is inconsistent. The examples for pointblank use vars() to select columns, but it seems like c() works identically for most functions except rows_distinct() and rows_complete().
Reproducible example
library(pointblank)
agent <-
create_agent(
tbl = small_table
)
x <- agent %>%
col_vals_lt(vars(a), value = 10)
x <- agent %>%
col_vals_lt(c(a), value = 10)
x <- agent %>%
rows_distinct(vars(a))
x <- agent %>%
rows_distinct(c(a))
#> Error in rlang::eval_tidy(columns): object 'a' not found
x <- agent %>%
rows_complete(vars(a))
x <- agent %>%
rows_complete(c(a))
#> Error in rlang::eval_tidy(columns): object 'a' not found
Created on 2022-06-16 by the reprex package (v2.0.1)
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.0 (2022-04-22)
#> os macOS Big Sur/Monterey 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2022-06-16
#> pandoc 2.17.1.1 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
#> blastula 0.3.2 2020-05-19 [1] CRAN (R 4.2.0)
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
#> DBI 1.1.2 2021-12-20 [1] CRAN (R 4.2.0)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
#> dplyr 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
#> generics 0.1.2 2022-01-31 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.2.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> pointblank * 0.10.0.9000 2022-06-09 [1] Github (rich-iannone/pointblank@2cf8431)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
#> rlang 1.0.2 2022-03-04 [1] CRAN (R 4.2.0)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.2.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
#> tibble 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
#> tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.2.0)
#> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Expected result
I would have expected either vars() or c() to work in the columns argument for rows_distinct() as either one works for many other functions in pointblank.
maybe related to #414
Thanks for submitting this issue and the previous one! Agree that it’s inconsistent and I should be able to address this soon.
I think I noticed that the segments argument also requires vars() and doesn't work with c()
Thanks @Aariq , I made a new issue to track this. Eventually I do want to get all these column/value inconsistencies fixed up.
Maybe off topic a bit but the place where things are non-ideal is in the value argument in many of the validation functions. It can take either literal values or even columns (where vars() might be best). I have to think a bit more on the UX implications of this doing further work on that, but perhaps there is nothing else that could be done with that particular arg without breaking a lot.
Since vars() is flagged as superseeded in the linked dplyr documentation, maybe better break it now than later.
For all expectations culumn only works with tidyselection helpers or column names, whereas value gives an error when using tidyselect ("must be used within a selecting function") and only works with vars().
Thanks for the patience! This is now supported in devel. columns = vars(a) and columns = c(a) are identical sans the captured user expression:
agent <-
create_agent(
tbl = small_table
)
x1 <- agent %>%
rows_distinct(vars(a))
x2 <- agent %>%
rows_distinct(c(a))
waldo::compare(x1, x2)
#> `old$validation_set$columns_expr`: "vars(a)"
#> `new$validation_set$columns_expr`: "c(a)"
x3 <- agent %>%
rows_complete(vars(a))
x4 <- agent %>%
rows_complete(c(a))
waldo::compare(x3, x4)
#> `old$validation_set$columns_expr`: "vars(a)"
#> `new$validation_set$columns_expr`: "c(a)"
We now encourage using c() instead of vars() for column selection (like dplyr::select()). Please file an issue if any vars()->c() conversion breaks your existing code!