Specify Sample Size a Bit More Clearly

Open ParadaCarleton opened this issue 3 years ago • 2 comments

The current phrasing contains the same information but is a bit harder to interpret:

Computed from 4000 by 262 log-likelihood matrix

ParetoSmooth.jl is more explicit about this, with a phrasing that mentions "n observations and m posterior samples." I've found this prevents mistakes -- it's very common for users to accidentally misspecify what it is they want to leave out, e.g. by placing all their observations in one multivariate normal, which causes confusion when the whole dataset is left out. I think this might be a more useful message.
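To illustrate the mistake described above, here is a minimal NumPy sketch. It is not code from loo or ParetoSmooth.jl: the toy model, the variable names, and the `normal_logpdf` helper are all assumptions made for illustration. It shows how writing the likelihood as one joint term collapses the log-likelihood matrix to a single column, so that "leave one out" silently means "leave the whole dataset out".

```python
import numpy as np


def normal_logpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y (hypothetical helper)."""
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


rng = np.random.default_rng(0)
n_draws, n_obs = 4000, 262  # sizes taken from the message quoted in the issue

y = rng.normal(size=n_obs)                      # toy data (assumption)
mu_draws = rng.normal(scale=0.1, size=n_draws)  # stand-in for posterior draws

# Pointwise log-likelihood: one column per observation.
pointwise = normal_logpdf(y[None, :], mu_draws[:, None], 1.0)

# Joint log-likelihood: all observations folded into one term per draw,
# which is what happens when the data are modelled as a single
# multivariate normal with independent components.
joint = pointwise.sum(axis=1, keepdims=True)

print(pointwise.shape)  # one row per draw, one column per observation
print(joint.shape)      # one row per draw, a single column
```

With the second matrix, a message that names both dimensions makes it obvious that only one term is available to leave out.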

Mar 19 '22 16:03 ParadaCarleton

Hey Carlos, yeah, we could be clearer about the posterior samples vs. observations. Thanks for the suggestion.

Mar 21 '22 23:03 jgabry

I started to make a PR for this, but there is a challenge: the second dimension (here 262) is not necessarily the number of observations, since we can leave out more than one observation at a time, and there is no way the loo package can infer that. How about:

Computed from 4000 posterior draws by 262 log-likelihood terms matrix
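As a rough sketch of how such a message could be built from the matrix dimensions alone, the Python function below formats the proposed wording. The name `describe_log_lik` is hypothetical and this is not the loo package's implementation (loo is an R package). It assumes only that rows are posterior draws and columns are log-likelihood terms.

```python
import numpy as np


def describe_log_lik(log_lik):
    """Format the proposed message for a draws-by-terms log-likelihood matrix.

    Hypothetical helper: columns are called "log-likelihood terms" because a
    column may cover more than one observation.
    """
    log_lik = np.asarray(log_lik)
    if log_lik.ndim != 2:
        raise ValueError("expected a 2-D (draws x terms) matrix")
    n_draws, n_terms = log_lik.shape
    return (
        f"Computed from {n_draws} posterior draws "
        f"by {n_terms} log-likelihood terms matrix"
    )


message = describe_log_lik(np.zeros((4000, 262)))
print(message)
```

Calling the columns "terms" rather than "observations" keeps the message accurate when several observations are left out together.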

Mar 24 '23 12:03 avehtari