
Harmonize ggeffects and modelbased

Open strengejacke opened this issue 2 years ago • 0 comments

@DominiqueMakowski This is a draft for ggeffects, WDYT? From the docs:

predict_response() is a wrapper around ggpredict(), ggeffect(), ggemmeans() and ggaverage(). Depending on the value of the marginalize argument, predict_response() calls one of those functions, sometimes with different arguments. The marginalize argument indicates how to marginalize over the non-focal predictors, i.e. those variables that are not specified in terms. Possible values are:

  • "mean_reference": calls ggpredict(), i.e. non-focal predictors are set to their mean (numeric variables) or reference level (factors, or "lowest" value in case of character vectors).

  • "mean_mode": calls ggpredict(typical = c(numeric = "mean", factor = "mode")), i.e. non-focal predictors are set to their mean (numeric variables) or mode (factors, or "most common" value in case of character vectors).

  • "marginalmeans": calls ggemmeans(), i.e. non-focal predictors are set to their mean (numeric variables) or marginalized over the levels or "values" for factors and character vectors. Marginalizing over the factor levels of non-focal terms computes a kind of "weighted average" for the values at which these terms are held constant.

  • "empirical" (or "counterfactual"): calls ggaverage(), i.e. non-focal predictors are marginalized over the observations in your sample. Technically, ggaverage() calculates predicted values for each observation in the data multiple times (the data is duplicated once for each unique value of the focal terms), each time fixing one unique value or level of the focal terms, and then takes the average of these predicted values (aggregated/grouped by the focal terms). These kinds of predictions are also called "counterfactual" predictions (Dickerman and Hernan 2020). There is a more detailed description in this vignette.
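To make the four options above concrete, here is a rough base-R sketch that computes each kind of prediction by hand. It does not call ggeffects at all. The mtcars-based data, the model formula, the choice of cyl as the focal term and all object names are assumptions made up for illustration, and the real functions may differ in details (e.g. how ggemmeans() weights factor levels).

```r
# Toy data: "am" is releveled so that the reference level ("manual")
# is not the most frequent level ("automatic"). Illustrative assumption.
d <- data.frame(
  mpg = mtcars$mpg,
  hp  = mtcars$hp,
  cyl = factor(mtcars$cyl),
  am  = factor(ifelse(mtcars$am == 1, "manual", "automatic"),
               levels = c("manual", "automatic"))
)
m <- lm(mpg ~ cyl + hp + am, data = d)

# Focal term: cyl. Non-focal terms: hp (numeric) and am (factor).
focal <- levels(d$cyl)

# 1) "mean_reference": hp at its mean, am at its reference level
grid_ref <- data.frame(
  cyl = factor(focal, levels = focal),
  hp  = mean(d$hp),
  am  = factor(levels(d$am)[1], levels = levels(d$am))
)
pred_ref <- predict(m, newdata = grid_ref)

# 2) "mean_mode": hp at its mean, am at its most frequent level
mode_am   <- names(which.max(table(d$am)))
grid_mode <- grid_ref
grid_mode$am <- factor(mode_am, levels = levels(d$am))
pred_mode <- predict(m, newdata = grid_mode)

# 3) "marginalmeans": hp at its mean, predictions averaged over all
#    levels of am, each level weighted equally (a balanced grid)
grid_bal <- expand.grid(
  cyl = factor(focal, levels = focal),
  hp  = mean(d$hp),
  am  = factor(levels(d$am), levels = levels(d$am))
)
grid_bal$pred <- predict(m, newdata = grid_bal)
pred_mm <- tapply(grid_bal$pred, grid_bal$cyl, mean)

# 4) "empirical" / "counterfactual": copy the full data once per focal
#    level, set cyl to that level for every row, predict, then average
pred_emp <- sapply(focal, function(lv) {
  cf <- d
  cf$cyl <- factor(lv, levels = focal)
  mean(predict(m, newdata = cf))
})

result <- data.frame(
  cyl            = focal,
  mean_reference = as.numeric(pred_ref),
  mean_mode      = as.numeric(pred_mode),
  marginalmeans  = as.numeric(pred_mm),
  empirical      = as.numeric(pred_emp)
)
print(result)
```

Because this toy model is linear and has no interactions, the four columns differ only by a constant shift, so the differences between the cyl levels are identical across methods. With interactions or a non-linear link function, the methods would generally disagree on the contrasts as well.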

In marginaleffects, that would be something like:

predictions(newdata = "means", by = ...), predictions(by = ...), predictions(newdata = "marginalmeans", by = ...), avg_predictions(variables = ...)

This looks like the four most common "marginalization" methods?

strengejacke, Feb 15 '24 08:02