How to pass variable names to the preconditions argument
I am looking to check the length of strings in a dataset (along the same lines as #140). However, I would like to wrap this test in a function to reduce repetition in my code (for example, when I want to check a large number of variables).
The roadblock I've run into is that I can't get a variable name I pass to the function I created (check_length below) to be used in the preconditions argument. In the reprex below, I create two pointblank agents. The first runs successfully, but the second does not because I use the check_length() function.
Is there a way to pass a variable name to the preconditions argument or a better way to go about this?
library(pointblank)
library(tidyverse)
library(palmerpenguins)
# convert factor columns to character
penguins_char <- penguins %>%
  mutate(across(where(is.factor), as.character))
# function to check the length of a variable
check_length <- function(x, variable, length) {
  x %>%
    col_vals_lte(
      vars(z),
      value = length,
      na_pass = TRUE,
      preconditions = ~ . %>% mutate(z = str_length({{variable}}))
    )
}
# This works
penguins_char %>%
  create_agent(
    actions = action_levels(stop_at = 1)
  ) %>%
  # is the island column <= 6 characters long?
  col_vals_lte(
    vars(z),
    value = 6,
    na_pass = TRUE,
    preconditions = ~ . %>% mutate(z = str_length(island))
  ) %>%
  interrogate()
####### REMOVED HTML output
# This does not work
penguins_char %>%
  create_agent(
    actions = action_levels(stop_at = 1)
  ) %>%
  # is the island column <= 6 characters long?
  check_length(island, 6) %>%
  interrogate()
#> Error in (function (arg) : object 'variable' not found
Created on 2021-05-07 by the reprex package (v2.0.0)
I tried making this work but I was unsuccessful. I will try again though (never give up!). Just wanted to let you know that I’m super interested in getting you a solution for this one.
While we could try to make this work with some automagic, my preference would be for the user to simply define the function outside of the preconditions argument (where you have full control over the metaprogramming), and then pass the resulting function object to preconditions.
In the case of the reprex, we can use rlang::inject() to "spell out" the selected column:
library(pointblank)
library(tidyverse)
library(palmerpenguins)
# convert factor columns to character
penguins_char <- penguins %>%
  mutate(across(where(is.factor), as.character))
check_length <- function(x, variable, length) {
  # Capture the supplied column as a symbol and splice it into the lambda function
  variable_sym <- rlang::ensym(variable)
  precondition_fn <- rlang::inject(
    ~ . %>% mutate(z = str_length(!!variable_sym))
  )
  # The actual validation step
  x %>%
    col_vals_lte(
      z,
      value = length,
      na_pass = TRUE,
      # Pass the prebuilt object instead of defining it on the spot here
      preconditions = precondition_fn
    )
}
agent <- penguins_char %>%
  create_agent(
    actions = action_levels(stop_at = 1)
  ) %>%
  check_length(island, 6) %>%
  interrogate()
#> ~. %>% mutate(z = str_length(island))
#>
#> ── Interrogation Started - there is a single validation step ───────────────────────────────────────
#> ✖ Step 1: STOP condition met.
#>
#> ── Interrogation Completed ─────────────────────────────────────────────────────────────────────────
agent$validation_set$preconditions
#> [[1]]
#> ~. %>% mutate(z = str_length(island))
#> <environment: 0x0000021eb0044268>
agent %>% get_agent_report(display_table = FALSE)
#> # A tibble: 1 × 14
#>       i type        columns values precon active eval  units n_pass f_pass W     S     N     extract
#>   <int> <chr>       <chr>   <chr>  <chr>  <lgl>  <chr> <dbl>  <dbl>  <dbl> <lgl> <lgl> <lgl>   <int>
#> 1     1 col_vals_l… z       6      1      TRUE   OK      344    292  0.849 NA    TRUE  NA         52
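
To see what rlang::inject() is doing here without involving pointblank at all, the splicing step can be tried in isolation. This is only a rough sketch under a few assumptions: make_length_precondition() is a hypothetical helper (not part of pointblank or rlang), it takes the column name as a string rather than a bare name, and base R's transform() and nchar() stand in for mutate() and str_length() so that only rlang is needed. Building one precondition per column name is one possible way to cover many variables.

```r
library(rlang)

# Hypothetical helper (not part of pointblank): build a one-sided formula
# with the column symbol already spliced into its right-hand side.
make_length_precondition <- function(column_name) {
  column_sym <- sym(column_name)  # "island" becomes the symbol island
  # Assumption: base R transform()/nchar() stand in for mutate()/str_length()
  inject(~ transform(., z = nchar(!!column_sym)))
}

precondition <- make_length_precondition("island")

# The right-hand side now names the column directly instead of
# referring to a function argument that may be out of scope later
f_rhs(precondition)

# Apply the lambda to a small toy table to confirm it adds the z column
toy <- data.frame(island = c("Dream", "Torgersen"), stringsAsFactors = FALSE)
toy_lengths <- as_function(precondition)(toy)$z

# One precondition per column name, e.g. when checking many variables
preconditions <- lapply(c("species", "island"), make_length_precondition)
```

Because the column is spelled out when the formula is created, nothing in it depends on the wrapper function's arguments by the time the precondition is evaluated.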