loo icon indicating copy to clipboard operation
loo copied to clipboard

`loo.function` should not require a `r_eff` argument

Open bgoodri opened this issue 6 years ago • 7 comments

If the user is providing a function to calculate the log-likelihood contribution of each observation, why does the user also need to pass a call to relative_eff? This just seems unnecessary.

bgoodri avatar Apr 16 '19 15:04 bgoodri

It does seem unnecessary. We can drop that and calculate it internally.

jgabry avatar Apr 16 '19 16:04 jgabry

Actually, computing r_eff requires knowing which chain each posterior draw comes from. The chain id is not a required input to loo.function because LOO just needs all draws stacked together. In other words, dropping the r_eff argument would require adding a chain_id argument to loo.function or requiring a chain_id column in draws so that r_eff can be calculated internally.

Here are three possibilities:

# drop the r_eff argument and require a chain_id argument
loo.function(x, ..., data, draws, chain_id)

# drop the r_eff argument and require chain_id to be included in draws
loo.function(x, ..., data, draws) # checks internally for a chain_id column in 'draws'

# keep the r_eff argument and add a chain_id argument so the user can 
# decide whether to compute r_eff and pass it in or provide chain_id and
# let loo calculate r_eff
loo.function(x, ..., data, draws, r_eff, chain_id)

Anyone have a preference or other suggestions?

jgabry avatar Apr 16 '19 16:04 jgabry

I would say (1) but is another option to require draws to be an array rather than a matrix. Or at least if it is an array, do not error if chain_id is left unspecified.

On Tue, Apr 16, 2019 at 12:30 PM Jonah Gabry [email protected] wrote:

Actually, computing r_eff requires knowing which chain each posterior draw comes from. The chain id is not a required input to loo.function because LOO just needs all draws stacked together. In other words, dropping the r_eff argument would require adding a chain_id argument to loo.function or requiring a chain_id column in draws so that r_eff can be calculated internally.

Here are three possibilities:

drop the r_eff argument and require a chain_id argument

loo.function(x, ..., data, draws, chain_id)

drop the r_eff argument and require chain_id to be included in draws

loo.function(x, ..., data, draws) # checks internally for a chain_id column in 'draws'

keep the r_eff argument and add a chain_id argument so the user can # decide whether to compute r_eff and pass it in or provide chain_id and# let loo calculate r_eff

loo.function(x, ..., data, draws, r_eff, chain_id)

Anyone have a preference or other suggestions?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stan-dev/loo/issues/106#issuecomment-483735450, or mute the thread https://github.com/notifications/unsubscribe-auth/ADOrqnHlbuX8DeQ0z8kyGdZM5fhrwCc0ks5vhfqlgaJpZM4cyqJe .

bgoodri avatar Apr 16 '19 18:04 bgoodri

Yeah if draws is an array then we don’t need the chain_id. I don’t think we should require it to be an array, but we could go with option (1) above but only requiring chain_id if draws is not an array.

@avehtari and @paul-buerkner any preference here?

jgabry avatar Apr 16 '19 22:04 jgabry

Now that we have posterior package, we have chain and iteration info often, and could compute r_eff based on those, and maybe then for the arrays/matrices without that info assume no r_eff computation unless additional information is provided?

avehtari avatar Mar 24 '23 13:03 avehtari

Bumping this as I'm now in favor of reducing the need to enter r_eff. I would be fine with that r_eff=1 unless we do get chain and iteration information via draws object, and in those cases estimate r_eff internally

avehtari avatar Nov 29 '23 17:11 avehtari

#235 reduced the need to provide r_eff. @bgoodri does that solve your problem?

avehtari avatar Feb 17 '24 17:02 avehtari