Support for glmmTMB count component conditioned on the zi-component
One thing that is rather difficult is the "overal" estimated means from a model, i.e. the count component conditioned on the zi-component:
library(glmmTMB)
library(modelbased)
library(ggeffects)
model <- glmmTMB(
count ~ mined + (1 | site),
ziformula = ~mined,
family = poisson,
data = Salamanders
)
# count model
ggpredict(model, "mined")
#>
#> # Predicted counts of count
#> # x = mined
#>
#> x | Predicted | SE | 95% CI
#> -------------------------------------
#> yes | 1.09 | 0.23 | [0.69, 1.72]
#> no | 3.42 | 0.09 | [2.86, 4.09]
#>
#> Adjusted for:
#> * site = NA (population-level)
#> Standard errors are on link-scale (untransformed).
estimate_means(model)
#> mined | rate | SE | 95% CI
#> ----------------------------------
#> no | 3.42 | 0.31 | [2.86, 4.09]
#> yes | 1.09 | 0.25 | [0.69, 1.72]
# zero-inflated model
ggpredict(model, "mined", type = "zi_prob")
#>
#> # Predicted zero-inflation probabilities of count
#> # x = mined
#>
#> x | Predicted | SE | 95% CI
#> -------------------------------------
#> yes | 0.76 | 0.24 | [0.66, 0.83]
#> no | 0.36 | 0.12 | [0.30, 0.41]
#>
#> Adjusted for:
#> * site = NA (population-level)
#> Standard errors are on link-scale (untransformed).
estimate_means(model, component = "zi")
#> mined | Mean | SE | 95% CI
#> ----------------------------------
#> no | 0.36 | 0.03 | [0.30, 0.41]
#> yes | 0.76 | 0.04 | [0.66, 0.83]
# model conditioning both on count- and zero-inflated model
ggpredict(model, "mined", type = "zero_inflated")
#>
#> # Predicted counts of count
#> # x = mined
#>
#> x | Predicted | SE | 95% CI
#> -------------------------------------
#> yes | 0.26 | 0.08 | [0.11, 0.42]
#> no | 2.21 | 0.22 | [1.75, 2.66]
#>
#> Adjusted for:
#> * site = NA (population-level)
#> Standard errors are on link-scale (untransformed).
Created on 2020-06-22 by the reprex package (v0.3.0)
If model is of class glmmTMB, hurdle, zeroinfl or zerotrunc, simulations from a multivariate normal distribution (see ?MASS::mvrnorm) are drawn to calculate mu*(1-p). Confidence intervals are then based on quantiles of these results. For type = "re.zi", prediction intervals also take the uncertainty in the random-effect paramters into account (see also Brooks et al. 2017, pp.391-392 for details).
Brooks ME, Kristensen K, Benthem KJ van, Magnusson A, Berg CW, Nielsen A, et al. glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal. 2017;9: 378-400.
Originally posted by @strengejacke in https://github.com/easystats/modelbased/issues/66#issuecomment-647532231
What would be necessary to support it correctly? Does emmeans deals with it alright?
No, I don't think this is supported by emmeans yet (resp. glmmTMB, which implements emmeans functionality).
It's not that hard, I'd say...
- simulate predictions:
prdat.sim <- .simulate_predictions(model, newdata, nsim, terms, value_adjustment, condition)(https://github.com/strengejacke/ggeffects/blob/f5f3056103940ecb53b8a44496e1804b29de7d39/R/predict_zero_inflation.R#L161) - calculate predictions for "overall" model:
sims <- exp(prdat.sim$cond) * (1 - stats::plogis(prdat.sim$zi)) - compute quantiles (SD, 95% quantiles for CI...):
prediction_data <- .join_simulations(data_grid, newdata, prdat, sims, ci, cleaned_terms)(https://github.com/strengejacke/ggeffects/blob/f5f3056103940ecb53b8a44496e1804b29de7d39/R/predict_zero_inflation.R#L2).
The code I used was "evolving" over time, so could possibly be more clean, but it is stable and works with edge cases ;-)
What you see under 2), is what you already get. However, the CIs/SEs are wrong, that's why all the simulation stuff is done here.
is this issue still valid? Does it have anything to do with get_predicted?
Yes, if you want to have the "correct" CI, we still haven't addressed this, I think.