Completely remove empty subgroups without post-hoc trimming/pruning
Hi @gmbecker , see the following scenario:
Create a dummy dataset:
library(rtables)
library(dplyr) # provides %>%, mutate() and case_when() used below

data <- data.frame(
  subj = paste0("subj", seq(1, 6)),
  paramcd = c(rep("a", 3), rep("b", 3)),
  abn = rep(c("low", "normal", "high"), 2),
  stringsAsFactors = TRUE
)
I want to create a table (without using trim_rows()/prune_table()) that shows, for the low and high directions only (not for normal), the number of subjects with at least one measurement of the parameter (whether "low", "normal", or "high").
map <- unique(
  data[data$abn != "normal", ]
) %>%
  lapply(as.character) %>%
  as.data.frame()

s_fun <- function(var = "subj", .spl_context) {
  # Count unique subjects in the full parent data frame of the
  # first row of the split context
  first_row <- .spl_context[1, ]
  subj <- first_row$full_parent_df[[1]][["subj"]]
  list(n = length(unique(subj)))
}
My query: first, let's update the data frame so that param "a" does not have any abnormalities.
data2 <- data
data2$abn[data$paramcd == "a" & data$abn != "normal"] <- "normal"
data2
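As a quick sanity check (a minimal base R sketch, separate from the rtables layout), a cross-tabulation can confirm that paramcd "a" no longer has any "low" or "high" records in data2. The object name tab is only illustrative.

```r
# Rebuild the dummy data so this check runs on its own
data <- data.frame(
  subj = paste0("subj", seq(1, 6)),
  paramcd = c(rep("a", 3), rep("b", 3)),
  abn = rep(c("low", "normal", "high"), 2),
  stringsAsFactors = TRUE
)
data2 <- data
data2$abn[data2$paramcd == "a" & data2$abn != "normal"] <- "normal"

# Rows are parameters, columns are abnormality directions
tab <- table(data2$paramcd, data2$abn)
print(tab)
```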
If I do not want the table to show params without any abnormalities (paramcd "a"), how can I achieve this without trim_rows()/prune_table()?
If I create an empirical map by deleting the "a" records, we do not get the correct table: the "n" values for "b" are incorrect.
In any case it would not be the best approach, since we are creating an ad-hoc map by manually deleting the params with zero abnormalities.
map2 <- map[map$paramcd != "a", ]
basic_table() %>%
  split_rows_by("paramcd", split_fun = trim_levels_to_map(map2)) %>%
  split_rows_by("abn") %>%
  analyze(vars = "subj", afun = make_afun(s_fun)) %>%
  build_table(df = data2)
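If the manual step is the main concern, the map itself could be derived from the data instead of deleting "a" by hand. The sketch below uses base R only; map_auto is a hypothetical name, and restricting the map to the paramcd and abn columns is an assumption (the map above also carries subj). It only automates the map creation and does not by itself address the incorrect "n" values.

```r
# Rebuild data2 so the sketch is self-contained
data <- data.frame(
  subj = paste0("subj", seq(1, 6)),
  paramcd = c(rep("a", 3), rep("b", 3)),
  abn = rep(c("low", "normal", "high"), 2),
  stringsAsFactors = TRUE
)
data2 <- data
data2$abn[data2$paramcd == "a" & data2$abn != "normal"] <- "normal"

# Keep only the paramcd/abn combinations that are actually abnormal
abn_rows <- data2[data2$abn != "normal", c("paramcd", "abn")]
map_auto <- unique(abn_rows)

# Convert the factor columns to character, as done for the map above
map_auto[] <- lapply(map_auto, as.character)
rownames(map_auto) <- NULL
print(map_auto)
```

Under these assumptions, map_auto could be passed to trim_levels_to_map() in place of map2.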
The creation of map2 is a manual process, so:
- Is it possible to do this by using trim_levels_in_group?
- Suppose we have the following data; can we prune the tree after the split?
   subj paramcd    abn
1 subj1       a normal
2 subj2       a normal
3 subj3       a normal
4 subj4       b    low
5 subj5       b normal
6 subj6       b   high
We would like to create the following table:
         all obs
----------------
b
  low
    n       3
  high
    n       3
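For reference, the counts expected in this table can be reproduced outside rtables. The following base R sketch assumes that "n" is the number of unique subjects with any record for the parameter, and that only parameters with at least one non-normal record are kept; by_param, n_by_param and has_abn are illustrative names.

```r
# The example data from above, with "a" having only normal records
data2 <- data.frame(
  subj = paste0("subj", 1:6),
  paramcd = rep(c("a", "b"), each = 3),
  abn = c("normal", "normal", "normal", "low", "normal", "high"),
  stringsAsFactors = TRUE
)

by_param <- split(data2, data2$paramcd)

# Denominator: unique subjects with any measurement, per parameter
n_by_param <- vapply(by_param, function(d) length(unique(d$subj)), integer(1))

# Parameters that have at least one abnormal (non-normal) record
has_abn <- vapply(by_param, function(d) any(d$abn != "normal"), logical(1))

# Only "b" remains, with the denominator computed on all of its rows
print(n_by_param[has_abn])
```

The denominator is computed before any filtering, which is why the "normal" rows cannot simply be dropped from the data.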
Conceptually, we would like to split the rows by paramcd at the first level and by abn at the second level:
     /\
    /  \
   a    b
   |   /|\
But since normal is not needed, this tree is effectively pruned to:
   |
   b
  / \
I also tried the following approach:
data3 <- data2 %>% mutate(
  abn2 = factor(
    case_when(
      abn == "low" ~ "low",
      abn == "high" ~ "high",
      TRUE ~ "" # "" is not among the levels below, so these rows become NA
    ),
    levels = c("low", "high")
  )
)
This does not achieve the goal: I still get an empty row for "a".
basic_table() %>%
  split_rows_by("paramcd", split_fun = trim_levels_in_group("abn2", drop_outlevs = TRUE)) %>%
  split_rows_by("abn2") %>%
  analyze(vars = "subj", afun = make_afun(s_fun)) %>%
  build_table(df = data3)
         all obs
----------------
a
b
  low
    n       3
  high
    n       3
Effectively, the tree looks like this:
     /\
    /  \
   a    b
       / \
The point here is that I cannot remove rows other than "low" or "high" from the analysis dataset, because we need them to obtain the correct "n" values.
@shajoezhu