`loo.function` should not require a `r_eff` argument
If the user is providing a function to calculate the log-likelihood contribution of each observation, why does the user also need to pass a call to relative_eff? This just seems unnecessary.
It does seem unnecessary. We can drop that and calculate it internally.
Actually, computing r_eff requires knowing which chain each posterior draw comes from. The chain id is not a required input to loo.function because LOO just needs all draws stacked together. In other words, dropping the r_eff argument would require adding a chain_id argument to loo.function or requiring a chain_id column in draws so that r_eff can be calculated internally.
Here are three possibilities:
# drop the r_eff argument and require a chain_id argument
loo.function(x, ..., data, draws, chain_id)
# drop the r_eff argument and require chain_id to be included in draws
loo.function(x, ..., data, draws) # checks internally for a chain_id column in 'draws'
# keep the r_eff argument and add a chain_id argument so the user can
# decide whether to compute r_eff and pass it in or provide chain_id and
# let loo calculate r_eff
loo.function(x, ..., data, draws, r_eff, chain_id)
Anyone have a preference or other suggestions?
I would say (1) but is another option to require draws to be an array rather than a matrix. Or at least if it is an array, do not error if chain_id is left unspecified.
On Tue, Apr 16, 2019 at 12:30 PM Jonah Gabry [email protected] wrote:
Actually, computing r_eff requires knowing which chain each posterior draw comes from. The chain id is not a required input to loo.function because LOO just needs all draws stacked together. In other words, dropping the r_eff argument would require adding a chain_id argument to loo.function or requiring a chain_id column in draws so that r_eff can be calculated internally.
Here are three possibilities:
drop the r_eff argument and require a chain_id argument
loo.function(x, ..., data, draws, chain_id)
drop the r_eff argument and require chain_id to be included in draws
loo.function(x, ..., data, draws) # checks internally for a chain_id column in 'draws'
keep the r_eff argument and add a chain_id argument so the user can # decide whether to compute r_eff and pass it in or provide chain_id and# let loo calculate r_eff
loo.function(x, ..., data, draws, r_eff, chain_id)
Anyone have a preference or other suggestions?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/loo/issues/106#issuecomment-483735450, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqnHlbuX8DeQ0z8kyGdZM5fhrwCc0ks5vhfqlgaJpZM4cyqJe .
Yeah if draws is an array then we don’t need the chain_id. I don’t think we should require it to be an array, but we could go with option (1) above but only requiring chain_id if draws is not an array.
@avehtari and @paul-buerkner any preference here?
Now that we have posterior package, we have chain and iteration info often, and could compute r_eff based on those, and maybe then for the arrays/matrices without that info assume no r_eff computation unless additional information is provided?
Bumping this as I'm now in favor of reducing the need to enter r_eff. I would be fine with that r_eff=1 unless we do get chain and iteration information via draws object, and in those cases estimate r_eff internally
#235 reduced the need to provide r_eff. @bgoodri does that solve your problem?