adcomp
Poisson regression not working as expected
Hi,
I'm trying to fit a Poisson regression with a random walk and overdispersion to a simulated data set. Everything runs, but I'm getting NaNs in the errors. Is there something I'm doing obviously wrong?
Here is the C++ code:
#include <TMB.hpp>
using namespace density;

template<class Type>
Type objective_function<Type>::operator() ()
{
  // SIMULATED DATA FOR POISSON REGRESSION
  // X ~ Po(deaths)
  // log deaths = alpha_0 + beta_0 * t + pi_t + overdispersion

  // DATA
  DATA_MATRIX(log_counts); // log counts for a single age group in multiple states,
                           // time across columns, states down rows
  size_t T = log_counts.cols(); // number of time points
  size_t N = log_counts.rows(); // number of states

  // PARAMETERS
  // PARAMETER(alpha_0);      // intercept: implicit in the random walk
  PARAMETER(beta_0);          // global slope
  PARAMETER(log_tau_rw);      // log precision of rw1
  PARAMETER(log_tau_epsilon); // log precision of overdispersion

  // ESTIMATED OUTPUT
  PARAMETER_MATRIX(log_counts_pred); // estimated counts
  PARAMETER_MATRIX(pi);

  // INITIALISE NEGATIVE LOG-LIKELIHOOD
  Type nll = Type(0.0);

  // TRANSFORM LOG PRECISIONS TO LOG STANDARD DEVIATIONS
  Type log_sigma_rw      = -log_tau_rw / Type(2);
  Type log_sigma_epsilon = -log_tau_epsilon / Type(2);

  // RANDOM WALK
  for (size_t n = 0; n < N; n++) {
    for (size_t t = 1; t < T; t++) {
      nll -= dnorm(pi(n, t), pi(n, t - 1), exp(log_sigma_rw), true);
    }
  }

  // PREDICTION
  for (size_t n = 0; n < N; n++) {
    for (size_t t = 0; t < T; t++) {
      nll -= dnorm(log_counts_pred(n, t),
                   beta_0 * Type(t + 1) + pi(n, t),
                   exp(log_sigma_epsilon), true);
    }
  }

  // DATA LIKELIHOOD
  for (size_t n = 0; n < N; n++) {
    for (size_t t = 0; t < T; t++) {
      nll -= dpois(log_counts(n, t), log_counts_pred(n, t), true);
    }
  }

  return nll;
}
Here's the R code to generate the data and fit the model:
library(TMB)

# series length (T) and number of different series (N)
T <- 1000
N <- 1

# parameters
alpha_0  <- 10
beta_0   <- -0.0001
prec_rw1 <- 100
prec_e   <- 300

# generate a first-order random walk with a given precision
generate.rw1 <- function(prec) {
  c(0, cumsum(rnorm(n = T - 1, mean = 0, sd = 1 / sqrt(prec))))
}

# generate random walk and overdispersion
set.seed(123)  # seed once here, so replicated walks are not identical
rw1 <- replicate(N, generate.rw1(prec_rw1))
od  <- rnorm(T, mean = 0, sd = sqrt(1 / prec_e))

# generate covariate values
t <- rep(1:T, N)

# compute mu values and simulate counts
real_lambda <- exp(alpha_0 + beta_0 * t + rw1 + od)
counts <- rpois(n = T * N, lambda = real_lambda)
dat <- data.frame(t = t, counts = counts)

# create matrix with data
log_counts <- matrix(log(dat$counts), N, T)

# compile cpp file
compile('temporal_rw1_drift.cpp')
dyn.load(dynlib('temporal_rw1_drift'))

# prepare data and parameter lists for TMB
data <- list(log_counts = log_counts)
parameters <- list(beta_0 = 1, log_tau_rw = 4, log_tau_epsilon = 1,
                   log_counts_pred = matrix(1, N, T), pi = matrix(0, N, T))

# run TMB model on simulated data
obj <- MakeADFun(data, parameters, random = "pi", DLL = 'temporal_rw1_drift')
obj$hessian <- FALSE
opt <- do.call("optim", obj)
sd <- sdreport(obj)

# extract fixed effects
fixed <- summary(sd, 'fixed')
Many thanks, any help would be much appreciated.
R
The right approach is to use the debugger to trap the first occurrence of NaN. See:
http://kaskr.github.io/adcomp/group__Errors.html
(scroll down to "Floating point exception").
Hans