HDI?
Would it be in the scope of posterior to provide support for HDI's in contrast with quantile-based intervals? For example, say I wanted to quickly summarize my draws by their modes and 95% intervals.
example_draws() %>%
summarise_draws(mode = Mode, ~ quantile(.x, probs = c(.025, .975)))
# A tibble: 10 × 4
variable mode `2.5%` `97.5%`
<chr> <dbl> <dbl> <dbl>
1 mu 5.91 -2.16 10.2
2 tau 1.74 0.174 14.6
3 theta[1] 5.67 -2.81 22.7
4 theta[2] 5.44 -3.24 14.9
5 theta[3] 5.52 -15.0 13.2
6 theta[4] 5.94 -4.64 14.8
7 theta[5] 4.26 -8.68 12.3
8 theta[6] 3.20 -6.80 12.8
9 theta[7] 4.46 -2.57 18.1
10 theta[8] 4.22 -6.53 14.8
There's great support for percentile-based intervals. Is there support for HDI's?
There is no support for HDIs yet but it might be a reasonable addition. The problem is the HDI rely on density estimates which require many more draws in order to be reliable enough.
Great! I acknowledge the limitations of HDIs, but it would be very nice if they were an option.
I'd be curious if anyone out there knows anything about the literature on HDI estimation methods? In {ggdist} I've been using the two methods from {HDInterval} and the more I look at them the less satisfied I am. Even with a large sample (> 10,000) there still seems to be a lot of variance. I think if we're going to implement something here it would be worth a bit of a dive into the literature to pick the best method first (then I can also update ggdist to use that method via posterior).
https://twitter.com/SolomonKurz/status/1478864812394397696
Even with a large sample (> 10,000) there still seems to be a lot of variance.
I would have assumed there is not much more variance than for quantiles, and that the most of the big variance comes from using quite extreme quantiles.
Andrew et al has a paper "Simulation-efficient Shortest Probability Intervals" http://www.stat.columbia.edu/~gelman/research/published/spin.pdf and CRAN R package SPIn https://cran.r-project.org/web/packages/SPIn/index.html SPIn reduces the variability a bit, and the paper illustrates the challenge of reducing the variance without additional assumptions about the target distribution. After the paper had been published, Andrew started to think that HDI is not that useful, and thus hasn't advocated the paper and the package. In that paper the focus was on 95% HDI, and since then Andrew has advocated, e.g. 20% and 80% quantiles as they have less variance.
Anyway, as the code for SPIn is available and based on quite well made paper, it would be "easy" to add. SPIn package has been last updated in 2013, so maybe it would better to just copy the relevant code to the posterior package?
FWIW, the bayestestR package has a recent PR that adds support for the posterior package (https://github.com/easystats/bayestestR/pull/518), i.e. all functions like hdi(), eti(), point_estimate() etc. now work on objects of class draws. There's probably some overlap between functions in bayestestR and posterior, but the scope of the packages is different and they rather complement, I'd say.
The bayestestR package documentation for hdi() doesn't tell which algorithm is used for HDI.
Anyway, I think it's great if bayestestR will support posterior objects.