
[Tutorial] LOO-CV

Open yebai opened this issue 4 years ago • 5 comments

There are some excellent packages for estimating Bayesian evidence for Turing models; supporting them would allow us to perform model comparison across different priors and model choices. We should consider these options - it could be a (killer) feature! (A usage sketch follows the list below.)

  • https://github.com/treigerm/AnnealedIS.jl
  • https://github.com/theogf/ThermodynamicIntegration.jl
  • https://github.com/TuringLang/NestedSamplers.jl
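For instance, thermodynamic integration could in principle be run directly on the model TuringGLM produces. Here is a minimal sketch, assuming ThermodynamicIntegration.jl's ThermInt callable accepts a Turing model as shown in that package's README (the data and formula are hypothetical; check the current docs before relying on this):

```julia
using TuringGLM, ThermodynamicIntegration

# Hypothetical toy data; any Tables.jl-compatible source works.
df = (; y = randn(100), x = randn(100))

# TuringGLM builds an ordinary Turing.jl model from the formula.
model = turing_model(@formula(y ~ x), df)

# Assumption: ThermInt's callable interface accepts a Turing model and
# returns the log evidence, as shown in the package README; verify
# against the current release.
alg = ThermInt(n_samples = 2_000)
logZ = alg(model)  # estimated log marginal likelihood (Bayesian evidence)
```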

LOO-CV can be viewed as a proxy for Bayesian evidence / marginal likelihood.

See Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432. https://link.springer.com/article/10.1007/s11222-016-9696-4
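As a concrete illustration, PSIS-LOO-CV (the method from the Vehtari et al. paper) can be computed for a fitted Turing model, e.g. with ParetoSmooth.jl. A minimal sketch with hypothetical toy data, assuming ParetoSmooth's psis_loo method for Turing models:

```julia
using TuringGLM, ParetoSmooth

# Hypothetical toy data.
df = (; y = randn(100), x = randn(100))
model = turing_model(@formula(y ~ x), df)

# TuringGLM re-exports Turing.jl, so sample and NUTS are in scope.
chain = sample(model, NUTS(), 1_000)

# Assumption: psis_loo has a method taking a Turing model plus its chain,
# per ParetoSmooth.jl's docs; verify against the current release.
loo = psis_loo(model, chain)  # PSIS-LOO-CV estimate plus diagnostics
```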

yebai avatar Feb 13 '22 21:02 yebai

Do we need support? TuringGLM.jl just returns an instantiated Turing.jl model. It also re-exports Turing.jl, so you can do anything you want with the instantiated model. There is no scaffolding or anything between you and the model once you specify it with the turing_model function.
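To make that concrete, here is a minimal sketch (with hypothetical data) of the usual workflow; everything after turing_model is plain Turing.jl:

```julia
using TuringGLM

# Hypothetical toy data.
df = (; y = randn(100), x = randn(100))

# turing_model returns a plain, instantiated Turing.jl model.
model = turing_model(@formula(y ~ x), df)

# From here on it is ordinary Turing.jl: sample, predict, diagnose, etc.
chain = sample(model, NUTS(), 2_000)
```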

Maybe we could convert this issue into a "call for tutorials" on those topics?

storopoli avatar Feb 14 '22 17:02 storopoli

Yes, AIS and TI should just work. NS still lacks Turing integration, but that is unrelated to TuringGLM. So this should be another tutorial.

yebai avatar Feb 14 '22 20:02 yebai

Ok, converting to a Tutorial issue.

storopoli avatar Feb 14 '22 20:02 storopoli

LOO-CV can be viewed as a proxy for Bayesian evidence / marginal likelihood.

Clarification in case someone stumbles across this in the future: This isn't quite true.

Bayesian evidence / marginal likelihood is equivalent to exhaustive cross-validation, rather than leave-one-out. In exhaustive CV, you average the cross-validation score over all 2^n possible train-test splits. This includes some pretty weird splits, e.g. one where your training set has 0 data points and your test set includes all of the data.
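As Fong & Holmes (2020) (cited below) show, the log marginal likelihood admits the prequential decomposition log p(y_1:n) = sum_i log p(y_i | y_1:i-1), and summing leave-p-out CV scores over every p = 1, ..., n recovers it exactly, which is where the 2^n splits come from. Below is a minimal sketch checking the decomposition numerically for a hypothetical conjugate-Normal example (not from this thread), using Distributions.jl:

```julia
using Distributions, LinearAlgebra

# Conjugate model: μ ~ Normal(0, 1), y_i | μ ~ Normal(μ, 1).
y = [0.3, -1.2, 0.8, 0.1]  # hypothetical data
n = length(y)

# One-step-ahead posterior predictive after observing `seen`:
# μ | seen ~ Normal(m, √v) with v = 1/(1 + k) and m = v * sum(seen),
# so the next observation is predicted as Normal(m, √(v + 1)).
function predictive(seen)
    v = 1 / (1 + length(seen))
    m = v * sum(seen; init = 0.0)
    return Normal(m, sqrt(v + 1))
end

# Prequential decomposition: log p(y_1:n) = Σ_i log p(y_i | y_1:i-1).
logml_seq = sum(logpdf(predictive(y[1:i-1]), y[i]) for i in 1:n)

# Direct marginal likelihood: y ~ MvNormal(0, I + ones(n, n)).
logml_direct = logpdf(MvNormal(zeros(n), Matrix(1.0I, n, n) .+ 1.0), y)

@assert logml_seq ≈ logml_direct  # the two computations agree
```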

To use an analogy: LOO-CV plays the same role as AIC (both estimate the expected out-of-sample loss), while Bayes factors play the role of BIC (both estimate the probability that a model is the best one in a candidate set).

ParadaCarleton avatar Feb 14 '22 21:02 ParadaCarleton

Thanks, @ParadaCarleton - your clarification is correct. Sorry, I wasn't precise. For a good reference on this, see below.

Fong, E., & Holmes, C. C. (2020). On the marginal likelihood and cross-validation. Biometrika, 107(2), 489–496. https://academic.oup.com/biomet/article/107/2/489/5715611

yebai avatar Feb 14 '22 22:02 yebai