[Feature]: Implement calculation of effective degrees of freedom
Describe the solution you would like.
A generic approach to implementing effective degrees of freedom has been developed by Monnahan, Thorson, and Ianelli. See Jan 13, 2025 email.
Describe alternatives you have considered
Current estimates of the variance of random effects will benefit from this code.
Statistical validity, if applicable
Thorson (2024) https://doi.org/10.1002/ecy.4327
Describe if this is needed for a management application
no
Additional context
No response
@Rick-Methot-NOAA I have a prototype of this working for an Alaska Pcod example. There are two challenges from a user standpoint.
First, the user must create a version of their model that has no data. The most straightforward way to do this would seem to be setting all data lambdas to 0 in the control file. Can you confirm this is the right approach and provide any guidance on how to do that efficiently? For this cod model I get this:
cbind(mod0$likelihoods_used, mod1$likelihoods_used)
                           values lambdas       values lambdas
TOTAL                 2.43416e+02      NA -15.27930000      NA
Catch                 7.20670e-11      NA   0.00000000      NA
Equil_catch           1.71963e-04      NA   0.00000000      NA
Survey               -5.95181e+01      NA   0.00000000      NA
Length_comp           2.22433e+02      NA   0.00000000      NA
Age_comp              5.56440e+01      NA   0.00000000      NA
Recruitment           4.78685e+00       1 -17.15630000       1
InitEQ_Regime         1.86537e+00       1   1.87267000       1
Forecast_Recruitment  0.00000e+00       1   0.00000000       1
Parm_priors           0.00000e+00       1   0.00000000       1
Parm_softbounds       4.27402e-03      NA   0.00427121      NA
Parm_devs             1.82003e+01       1   0.00000000       1
Crash_Pen             0.00000e+00       1   0.00000000       1
Here the last two columns are from the model without data. Does this seem right? I'm surprised the Parm_devs are turned off because I did not specify a lambda for "12=parm_dev".
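Regarding how to script the lambda changes efficiently, here is one possible approach as an untested sketch: it assumes SS_read()/SS_write() expose the control-file lambda table as ctl$N_lambdas and ctl$lambdas with the column names used below, and the like_comp codes for the data components should be checked against the control file.
# Sketch (untested): zero the lambdas for all data components in a
# 2-fleet model. The like_comp codes and column names are assumptions;
# check them against your control file and the r4ss ctl list.
s1 <- r4ss::SS_read("Model_24.1")
data_comps <- 1:9  # assumed like_comp codes for the data components
s1$ctl$lambdas <- expand.grid(like_comp = data_comps, fleet = 1:2)
s1$ctl$lambdas$phase <- 1
s1$ctl$lambdas$value <- 0
s1$ctl$lambdas$sizefreq_method <- 1
s1$ctl$N_lambdas <- nrow(s1$ctl$lambdas)
r4ss::SS_write(s1, dir = "Model_24.1_nodata", overwrite = TRUE)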
Second, the user must specify which parameters are affected by some kind of penalty and are thus informed by more than the data. The penalty could be a hyperdistribution, e.g., on annual deviations (sigmaR and recdevs), a Bayesian prior, or a simple parameter penalty. The user needs a way, in R, to subset those parameters. Here is some pilot code to give you a sense of this:
mod0 <- r4ss::SS_output(dir='Model_24.1/')
mod1 <- r4ss::SS_output(dir='Model_24.1_nodata/')
par_names <- mod0$parameters |>
  dplyr::filter(!is.na(Active_Cnt)) |>
  dplyr::pull(Label)
## declare which parameters are random effects or have priors
randoms <- "Early_InitAge|Main_RecrDev|rwalk|DEVadd"
lrandom <- grepl(randoms, par_names)
# rename some of the vectors for grouping
par_names2 <- dplyr::case_when(
  grepl('InitAge_', par_names) ~ 'InitAge',
  grepl('DEVadd', par_names) ~ 'Size_DblN_ascnd_se_survey',
  grepl('RecrDev', par_names) ~ 'RecrDev',
  grepl('VonBert_K_Fem_GP_1_DEV_MR_rwalk_bnd', par_names) ~ 'VonB_k',
  grepl('L_at_Amin_Fem_GP_1_DEV_MR_rwalk_bnd', par_names) ~ 'VonB_Lmin'
)
par_names2[lrandom] # check the penalized subset
Can you give some guidance on how a user could check/verify that they have all the parameters in there? I think the devs are obvious, but the ones with priors are not.
@Cole-Monnahan-NOAA, Thanks for working on this. Two quick notes on ongoing work in r4ss that may be able to help with the two challenges you identify.
First, this issue opened today by @e-perl-NOAA https://github.com/r4ss/r4ss/issues/1002 could be a user-friendly approach to removing all the data from the likelihood. I think lambdas would be the most straightforward way to implement that function, but it could also be done with negative fleets (which provide expected values for each observation) or negative years (which should be the same as lambda = 0).
Second, I've just created a pull request for some table-making functions, including one that creates a table of information on all the parameters (basically reformatting the parameters table output by SS3). That function uses Pr_type == "dev" (in https://github.com/r4ss/r4ss/blob/tables/R/table_pars.R#L80-L81) to identify all the deviation parameters. The determination of which sigma applies to which parameter needs to be improved, but I think that Pr_type != "No_prior" should identify parameters with either an informative prior or a deviation penalty.
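As a rough, untested sketch of that check in R (assuming the Pr_type and Active_Cnt columns of the parameters table from SS_output(), and the mod0, par_names, and lrandom objects from the pilot code above):
# flag active parameters informed by more than the data: anything with
# an informative prior or a dev penalty ("No_prior" marks the rest)
penalized <- mod0$parameters |>
  dplyr::filter(!is.na(Active_Cnt), Pr_type != "No_prior") |>
  dplyr::pull(Label)
# anything returned here was missed by the regex-based subset
setdiff(penalized, par_names[lrandom])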
I think it may be possible to co-opt the code that does depletion_fleet to also do the needed calculations. See in SS_readcontrol.tpl: SS_Label_Info_4.14.1 #Adjust the phases to negative if beyond turn_off_phase
With depletion_fleet, SS3 adds 1 to the phase in which each parameter becomes active, except for the one depletion_fleet, which keeps lambda=1 for the "survey" of that fleet. So, in phase 1 only that one data type for that one fleet has a positive lambda. I think we could do something similar to get everything turned off. In fact, as is, you might run a model with lambda=0 for a depletion fleet, then read that par file back in with the depletion fleet now turned on. Just brainstorming here....
Regarding the parm_devs -logL: if all other data types have lambda = 0, then there are no data to tell the model that any dev has a value different from 0.0, so the logL for the parm_devs will be zero.
There already is a concept of depletion_type implemented if a survey is a depletion_fleet. So, I think we could create a depletion_type==3 that does what is needed for the df calculations.
if (Svy_units(f) == 34) // special code for depletion, so prepare to adjust phases and lambdas
{
  echoinput << "# survey: " << f << " " << fleetname(f) << " is a depletion fleet" << endl;
  depletion_fleet = f;
  depletion_type = Q_setup(f, 2);
  if (depletion_type == 0)
    echoinput << "link_info=0; add 1 to phases of all parms; only R0 active in new phase 1 (same as 3.24 logic)" << endl;
  if (depletion_type == 1)
    echoinput << "link_info=1 only R0 active in phase 1; then exit; useful for data-limited draws of other fixed parameter" << endl;
  if (depletion_type == 2)
    echoinput << "link_info=2 no phase adjustments, can be used when profiling on fixed R0" << endl;
}
@iantaylor-NOAA I tried negative years but SS3 catches that as an error (rightfully so). I did not try negative fleets. Perhaps that would work.
@Rick-Methot-NOAA We evaluate this no-data model from the .par file from the full model (-maxfn 0). So the dev penalties should be non-zero because the devs are non-zero. My question about the lambdas was whether there is a way to tell from the output that it's working correctly. I have never used them before. Does this seem right for a model with 2 fleets?
# 10=recrdev; 11=parm_prior; 12=parm_dev; 13=CrashPen; 14=Morphcomp; 15=Tag-comp; 16=Tag-negbin; 17=F_ballpark; 18=initEQregime
#like_comp fleet phase value sizefreq_method
1 1 1 0 0
1 2 1 0 0
2 1 1 0 1
2 2 1 0 1
3 1 1 0 1
3 2 1 0 1
4 1 1 0 1
4 2 1 0 1
5 1 1 0 1
5 2 1 0 1
# 6 1 1 0 1
# 6 2 1 0 1
7 1 1 0 1
7 2 1 0 1
8 1 1 0 1
8 2 1 0 1
9 1 1 0 1
9 2 1 0 1
Perhaps it's easier for me to turn over my prototype and let you guys play with it and sort out the best way forward?
lambdas do not turn off the data. They multiply the -logL by lambda. This means that the -log(s) terms are also being multiplied by lambda, and that is probably not what you want.
Right, but if I set all the lambdas for the data to 0, then that would effectively remove those components from the joint NLL. Right?
Cole, The standard SS3 logL table shows, for each data type, a row with the lambdas, then a row with the first value being sum(lambda*logL) followed by the fleet-specific logL (without lambda). e.g.
Fleet:            ALL  1        2       3
.......
Surv_lambda:        _  0        1       1
Surv_like:   0.967178  0 -5.98811 6.95529
using:
SS2out << "Surv_lambda: _ " << column(surv_lambda, k) << endl
<< "Surv_like: " << surv_like * column(surv_lambda, k) << " " << surv_like << endl;
Aside from the question of how best to achieve the goals of this issue, it would be good to know what works or doesn't work with negative years. Currently the User Manual recommends using negative years and negative fleets in the "Excluding Data" section: https://nmfs-ost.github.io/ss3-doc/SS330_User_Manual_release.html#ExcludingData.
I just wrote the simple script below, which uses negative years for all data types, but I see that I missed two things that are still contributing to the likelihood: the initial equilibrium catch (associated with year = -999) and the soft bounds (turned on/off via a switch in the starter file).
# read model input files
s1 <- SS_read(system.file("extdata", "simple_small", package = "r4ss"))
# turn years negative for all data; abs() handles the catch because it
# includes the -999 equilibrium-catch year
s1$dat$catch$year <- -1*abs(s1$dat$catch$year)
s1$dat$CPUE$year <- -1*s1$dat$CPUE$year
s1$dat$lencomp$year <- -1*s1$dat$lencomp$year
s1$dat$agecomp$year <- -1*s1$dat$agecomp$year
s1$dat$MeanSize_at_Age_obs$year <- -1*as.numeric(s1$dat$MeanSize_at_Age_obs$year)
# remove catchability parameters to avoid "Fatal Error! Q setup error; no survey obs ..."
s1$ctl$Q_options <- NULL
s1$ctl$Q_parms <- NULL
# write new output files and run revised model
SS_write(s1, dir = "simple_small_neg_year", overwrite = TRUE)
run("simple_small_neg_year", show_in_console = TRUE)
# read output
s1out <- SS_output("simple_small_neg_year")
s1out$likelihoods_used
#                            values lambdas
# TOTAL                -2.54187e-08      NA
# Catch                 0.00000e+00      NA
# Equil_catch           7.77429e-12      NA
# Recruitment           0.00000e+00       1
# InitEQ_Regime         0.00000e+00       1
# Forecast_Recruitment  0.00000e+00       1
# Parm_priors           0.00000e+00       1
# Parm_softbounds      -2.54265e-08      NA
# Parm_devs             0.00000e+00       1
# Crash_Pen             0.00000e+00       1
I think that we want to include the equilibrium catch because that's basically a penalty, right?
Worth pinging @James-Thorson-NOAA to provide any further context.
In any case I think that I can pass the script off to you and let you guys iterate and improve. I'm in over my head on the SS3 details here.
Apparently I can't attach a .R file so I've pasted it below. I'd be curious to see if you guys can get it to work on some different models.
Thanks, Cole
# A quick demo of calculating the conditional AIC (cAIC) value
# for a stock synthesis model (ss3)
# Started 4/2025 by Cole Monnahan ([email protected])
# The code is an implementation of equation 6 (method 2) from
# https://arxiv.org/abs/2411.14185 and adapted from code provided
# by James Thorson. For interpretation of cAIC and effective
# degrees of freedom (EDF) see https://doi.org/10.1002/ecy.4327
# To make this calculation we need (1) two Hessian matrices, (2)
# a logical vector identifying which parameters are informed by
# more than the data via priors/penalties, including random
# effects, and (3) a vector to group the parameters.
# The first Hessian is from the original model at the mode (after
# optimization). The second Hessian is from the same model but
# without any data, and is calculated at the same parameter
# vector from the first model, i.e. the mode or MLE, and thus is
# not optimized.
# Creating a model without any data is non-trivial to do in SS3
# and is ongoing work. For this demo I created the second model by
# copying the folder for the original and modifying the .ctl file
# to have lambda=0 for all data components. I also copied over
# the ss3.par file and start from that. The no-data model is run
# with -maxfn 0 from the command line to get the data-free
# Hessian in that folder. The user is responsible for correctly
# removing the data from the model, and should recognize that if
# not done correctly the cAIC and EDF outputs will be wrong.
# Finally, identifying which parameters are informed by more than
# the data is also done manually. Any vector of "devs" or "random
# effects" will count, as they have a hierarchical penalty. For
# instance sigmaR penalizes the recdevs. Any parameter with a
# penalty or prior would also count. The user must specify which
# parameters these are. Since SS3 appends the year onto dev
# parameters e.g. "Main_RecrDev_1980" these will not be unique.
# It is useful to be able to group by parameter name and sum the
# EDF contributions to calculate the EDF for that parameter. So a
# second character vector is needed to do that grouping. The user
# must do this manually as well.
library(r4ss)
library(dplyr)
library(adnuts)
library(tidyr)
library(ggplot2)
# Paths to two model versions
p0 <- 'Model_24.1/' # has data
p1 <- 'Model_24.1_nodata' # does not have data
mod0 <- r4ss::SS_output(dir=p0)
# starts from the mod0 ss3.par file, and was run with '-maxfn 0'
mod1 <- r4ss::SS_output(dir=p1) # without data
par_names <- mod0$parameters |>
  filter(!is.na(Active_Cnt)) |>
  pull(Label)
head(par_names)
# check that the likelihoods are turned off correctly
mod1$likelihoods_by_fleet
cbind(mod0$likelihoods_used, mod1$likelihoods_used)
# declare which parameters are random effects or have priors.
# This is done manually and carefully by examining par_names and
# using your understanding of the parameterization of the model.
randoms <- "Early_InitAge|Main_RecrDev|rwalk|DEVadd"
lrandom <- grepl(randoms, par_names)
# rename some of the vectors for grouping later
par_names2 <- case_when(
  grepl('InitAge_', par_names) ~ 'InitAge',
  grepl('DEVadd', par_names) ~ 'Size_DblN_ascnd_se_survey',
  grepl('RecrDev', par_names) ~ 'RecrDev',
  grepl('VonBert_K_Fem_GP_1_DEV_MR_rwalk_bnd', par_names) ~ 'VonB_k',
  grepl('L_at_Amin_Fem_GP_1_DEV_MR_rwalk_bnd', par_names) ~ 'VonB_Lmin'
)
par_names2[lrandom] # make sure this looks right!
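# A tally of the grouping is a quick completeness check: any NA counted
# here means an identified parameter was missed by the patterns above
table(par_names2[lrandom], useNA = "ifany")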
# Now get the two Hessian matrices and subset them down to the
# parameters identified above.
Hess0 <- adnuts:::.getADMBHessian(p0)[lrandom,lrandom]
Hess1 <- adnuts:::.getADMBHessian(p1)[lrandom,lrandom]
negEDF <- diag(solve(Hess0, Hess1)) # this is the key calculation
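# negEDF is the diagonal of solve(Hess0) %*% Hess1, comparing curvature
# without data to curvature with data; 1 - negEDF is each parameter's
# contribution to the effective degrees of freedom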
# can now calculate cAIC and EDF
q <- sum(lrandom) # no. of random effects (ish)
p <- sum(1-lrandom) # no. of fixed effects
jnll0 <- mod0$likelihoods_used[1,1]
jnll1 <- mod1$likelihoods_used[1,1]
cnll <- jnll0 - jnll1
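# cnll is the data-only (conditional) NLL: the joint NLL minus the
# penalty/prior NLL evaluated at the same parameter vector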
## conditional AIC (new calculation)
cAIC <- 2*cnll + 2*(p+q) - 2*sum(negEDF)
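# equivalently, cAIC = 2*cnll + 2*(p + EDF) with EDF = sum(1 - negEDF)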
# standard marginal AIC
mAIC <- 2*(p+q) - 2*(-jnll0)
out <- round(data.frame(n_fixed=p, EDF=sum(1-negEDF), n_total=p+sum(1-negEDF),
                        cAIC=cAIC, mAIC=mAIC), 1)
out
# Can be useful/interesting to group by variable name. Index will
# often be year but not always.
edf <- data.frame(par=par_names2[lrandom], n=1, edf=1-negEDF) |>
  group_by(par) |>
  mutate(index=1:n()) |>
  ungroup()
ggplot(edf, aes(index, y=edf)) + geom_point() +
  facet_wrap('par')
tab <- edf |> group_by(par) |> summarize(n=sum(n), edf=sum(edf)) |>
  arrange(desc(edf)) |>
  bind_rows(edf |> summarize(par='Total', n=sum(n), edf=sum(edf))) |>
  mutate(pct=100*edf/n)
gt::gt(tab) |> gt::fmt_number(columns = 3:4, decimals = 1)
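For completeness, the no-data evaluation itself can be scripted too. A minimal, untested sketch using r4ss::run, assuming the full model's ss3.par has already been copied into p1 and the starter file there is set to read initial values from it:
# evaluate the no-data model at the MLE without optimizing, so its
# Hessian is computed at the same parameter vector as the full model
r4ss::run(dir = p1, extras = "-maxfn 0", show_in_console = TRUE)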
Hi Cole, Thanks for including your model. My challenges working in R are on par with yours in SS3 code, so we may be mutually stalled for a while, as Ian and I have some other big efforts underway. I'll keep this on my to-do list. Rick
@Cole-Monnahan-NOAA, would you mind including the output of your script or sending me the two model files that you used so that I can run the script on my computer?
@e-perl-NOAA I sent a G-drive folder link for you three to play with it. Please don't share them without permission.