ciTools
add_ci.lmer chokes on "big data"
I'm finding that add_ci.lmer is not usable on "big data". I tried an example from the mermod vignette with 200,000 observations, and R could not fit the new data frame into memory. Here's the example I tried:
## linear example
library(dplyr)    # sample_n, bind_rows, select, %>%
library(lme4)     # lmer
library(ciTools)  # add_ci

## simulate the predictor and group labels: ng groups, nw observations each
x_gen_mermod <- function(ng = 8, nw = 5){
  n <- ng * nw
  x2 <- runif(n)
  group <- rep(as.character(1:ng), each = nw)
  return(tibble::tibble(x2 = x2,
                        group = group))
}

## pipe-friendly wrapper around model.matrix
mm_pipe <- function(tb, ...){
  model.matrix(data = tb, ...)
}

get_validation_set <- function(tb, sigma, sigmaG, beta, includeRanef, groupIntercepts){
  vm <- sample_n(tb, 5, replace = FALSE)[rep(1:5, each = 100), ]
  vf <- bind_rows(vm, tb) %>%
    select(-group) %>%
    mm_pipe(~.*.)
  vf <- vf[1:500, ]
  vGroups <- if(!includeRanef) rnorm(500, 0, sigmaG) else groupIntercepts[as.numeric(vm$group)]
  vm[["y"]] <- vf %*% beta + vGroups + rnorm(500, mean = 0, sd = sigma)
  vm
}

## simulate the response from a random-intercept model
y_gen_mermod <- function(tb, sigma = 1, sigmaG = 1, delta = 1, includeRanef = FALSE, validationPoints = FALSE){
  groupIntercepts <- rnorm(length(unique(tb$group)), 0, sigmaG)
  tf <- tb %>%
    dplyr::select(-group) %>%
    mm_pipe(~.*.)
  beta <- rep(delta, ncol(tf))
  if(validationPoints) {
    vm <- get_validation_set(tb, sigma, sigmaG, beta, includeRanef, groupIntercepts)
  }
  tb[["y"]] <- tf %*% beta + groupIntercepts[as.numeric(tb$group)] + rnorm(nrow(tb), mean = 0, sd = sigma)
  tb[["truth"]] <- tf %*% beta + groupIntercepts[as.numeric(tb$group)] * (includeRanef)
  if(validationPoints) return(list(tb = tb, vm = vm)) else return(tb)
}

## 10 groups x 20,000 observations = 200,000 rows
tb <- x_gen_mermod(10, 20000) %>%
  y_gen_mermod()
fit2 <- lmer(y ~ x2 + (1|group), data = tb)
tb %>% add_ci(fit2, type = "parametric", includeRanef = TRUE, names = c("LCB", "UCB"))
lmer works just fine on an example data set this large, but add_ci chokes and spits out
Error: cannot allocate vector of size 298.0 Gb
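One observation that may help narrow this down: the size in the error message is what a dense n-by-n matrix of doubles would need for n = 200,000. The arithmetic below checks that. It only shows that the numbers line up; whether add_ci.lmer really forms such a matrix internally is a guess, not something confirmed from the code.

```r
# Size of a dense n x n matrix of doubles (8 bytes each),
# in the binary gigabytes that R reports as "Gb".
n <- 200000
bytes <- n^2 * 8
gib <- bytes / 2^30

sprintf("%.1f", gib)  # "298.0"
```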
We need to re-examine how we are storing things in memory and see if we can do something more efficient. I'm not sure if this bug affects the other methods as well.
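As a starting point, here is a minimal sketch of one common way to avoid this kind of blow-up. It assumes (without having confirmed it in the ciTools source) that the interval code needs the per-row variances diag(X V X'), where X is the n-by-p design matrix and V is a p-by-p covariance matrix. The helper name rowwise_quad_form is hypothetical. Computing the diagonal row by row keeps memory at O(n * p) instead of O(n^2).

```r
# Hypothetical helper: returns diag(X %*% V %*% t(X)) without ever
# building the n x n matrix. (X %*% V) is n x p, and the elementwise
# product with X followed by rowSums gives x_i' V x_i for each row i.
rowwise_quad_form <- function(X, V) {
  rowSums((X %*% V) * X)
}

# Small check against the naive computation.
set.seed(1)
n <- 1000
p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), nrow = n))  # stand-in design matrix
V <- crossprod(matrix(rnorm(p * p), nrow = p))       # stand-in for a vcov matrix

var_full <- diag(X %*% V %*% t(X))     # allocates an n x n matrix
var_rowwise <- rowwise_quad_form(X, V) # allocates only n x p

stopifnot(isTRUE(all.equal(var_full, var_rowwise)))
```

If the memory problem turns out to come from somewhere else (for example, a dense random-effects design matrix), the same idea of computing only the needed row-wise quantities may still apply, but that would need checking against the actual code path.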