`.path_to_pos()` fails when `path` contains regular expression symbols
The code at https://github.com/insightsengineering/rtables/blob/main/R/tt_pos_and_access.R#L995 can't match paths that contain regular expression symbols like +. For example:
pathregex <- "gear + carb"
pathstrs <- c("gear + carb`3 + 1", "gear + carb`3 + 2")
grep(pathregex, pathstrs)
# integer(0)
Adding fixed = TRUE to grep() seems a solution, but it would affect the function of character *.
See also https://github.com/johnsonandjohnson/junco/issues/100
@gmbecker Can we just escape "+" inside escape_name_padding i.e.
ret <- gsub("+", "\\+", ret, fixed = TRUE)
Running the code snippet below (no dependency on junco) without the change gives me:
But with the change gives me no errors.
library(rtables)
library(dplyr)
new_scorefun <- function(colpath) {
paths <- NULL
function(tt) {
paths <<- col_paths(tt)
unlist(cell_values(tt, colpath = colpath), use.names = FALSE)[1]
}
}
data <- mtcars
data <- data %>%
mutate(`gear + carb` = paste(gear, "+", carb))
data$cyl <- as.factor(data$cyl)
data$hp <- as.factor(data$hp)
data$`gear + carb` <- as.factor(data$`gear + carb`)
lyt <- rtables::basic_table(show_colcounts = TRUE) %>%
rtables::split_cols_by("gear + carb") %>%
rtables::add_overall_col("All Cars") %>%
rtables::analyze("hp")
table <- rtables::build_table(lyt = lyt, df = data)
sorted_table <- rtables::sort_at_path(table,
c("root", "hp"),
scorefun = new_scorefun(colpath = c("gear + carb"))
)
Hmm, escape_name_padding is intended to cover names that are modified in the sibling-name uniqification process. I don't recall if it is even called on names that don't require uniqification (and I'm not fully convinced it should be).
Also even if it is applied universally, accounting for "+" would just fix this specific issue, it would still break for any other symbol that is "regex-active" (other than [ and ], which it already escapes). And once people start creating syntactically invalid column names, who knows what they'll eventually want to put in there, so we'd need to escape every regex-active character to be completely safe, which I'm not super keen on doing.
Thanks for the info and context 👍