SmoothingIndex doesn't de-mean returns
I believe there is a bug on line 119 of SmoothingIndex.R:
MA2 = arima(R, order=c(0,0,MAorder), method="ML", transform.pars=TRUE, include.mean=FALSE)
I believe the correct version should be
MA2 = arima(R - mean(R, na.rm = TRUE),
order=c(0,0,MAorder), method="ML", transform.pars=TRUE, include.mean=FALSE)
To see why, note that the preceding comments (lines 115–116) state:
# include.mean: Getmansky, et al. JFE 2004 p 555 "By applying the above procedure
# to observed de-meaned returns...", so set parameter to FALSE
However, the returns (R) are never de-meaned before arima is called. Whenever the mean return is nonzero, this yields incorrect theta estimates, as the example below shows.
set.seed(12345)
# Create random returns
unsmooth <- rnorm(25000, mean = 3, sd = 1.5)
# Smooth them with true thetas = c(0.5, 0.3, 0.2)
smooth <- stats::filter(unsmooth, c(0.5, 0.3, 0.2), sides = 1)
# Here is the buggy estimation
fit_wrong <- arima(smooth, c(0, 0, 2), include.mean = FALSE)
# Here is the corrected estimation
fit_right <- arima(smooth - mean(smooth, na.rm = TRUE), c(0, 0, 2), include.mean = FALSE)
# Calculate thetas as on line 142
thetas_wrong <- c(1, coef(fit_wrong)) / (1 + sum(coef(fit_wrong)))
thetas_right <- c(1, coef(fit_right)) / (1 + sum(coef(fit_right)))
round(thetas_wrong, 2) # 0.36 0.38 0.26 -- wrong results
round(thetas_right, 2) # 0.50 0.30 0.20 -- as expected
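For reference, the normalization used on the two thetas_* lines can be derived as follows. This is only a sketch in my own notation (eta_t for the innovations of the true returns, phi_j for the MA coefficients that arima reports), assuming the Getmansky et al. constraint that the thetas sum to one and an MA order of 2 as in the example.

```latex
\begin{align*}
R^o_t - \mu &= \theta_0 \eta_t + \theta_1 \eta_{t-1} + \theta_2 \eta_{t-2},
  \qquad \theta_0 + \theta_1 + \theta_2 = 1 \\
            &= \varepsilon_t + \phi_1 \varepsilon_{t-1} + \phi_2 \varepsilon_{t-2},
  \qquad \varepsilon_t = \theta_0 \eta_t,\quad \phi_j = \theta_j / \theta_0 \\
\Rightarrow\quad \theta_0 &= \frac{1}{1 + \phi_1 + \phi_2},
  \qquad \theta_j = \frac{\phi_j}{1 + \phi_1 + \phi_2}
\end{align*}
```

The left-hand side is the de-meaned observed return, which is why the mean has to be removed (or estimated) before the MA fit. With the true thetas of 0.5, 0.3, 0.2, the implied MA coefficients are 0.6 and 0.4, and the formula recovers 0.5, 0.3, 0.2.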
Alternatively, you could let arima estimate the intercept jointly with the MA coefficients (include.mean = TRUE, which is the default), although I believe Getmansky et al. de-mean first, as in the implementation above.
set.seed(12345)
fit_alt <- arima(smooth, c(0, 0, 2))
thetas_alt <- c(1, coef(fit_alt)[1:2]) / (1 + sum(coef(fit_alt)[1:2]))
round(thetas_alt, 2) # 0.50 0.30 0.20 -- as expected
Hi, I was wondering whether this has been resolved?
Nope, still not resolved. I took another look at the code, and on lines 26–28, the author writes
#' \code{include.mean}: Getmansky, et al. (2004) p 555 "By applying the above
#' procedure to observed de-meaned returns...", so we set that parameter to
#' 'FALSE'.
The author probably assumed that include.mean = FALSE performs de-meaning. It doesn't: it only tells arima not to estimate a mean, so the model effectively assumes that the series has already been de-meaned.
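Stated as equations, and as far as I understand the stats::arima documentation, the two settings fit the following MA(q) models (m, phi_j and epsilon_t are my notation, not the package's):

```latex
\begin{align*}
\texttt{include.mean = TRUE:}  \quad & X_t = m + \varepsilon_t + \sum_{j=1}^{q} \phi_j \varepsilon_{t-j},
  \quad m \text{ estimated} \\
\texttt{include.mean = FALSE:} \quad & X_t = \varepsilon_t + \sum_{j=1}^{q} \phi_j \varepsilon_{t-j},
  \quad m \text{ fixed at } 0
\end{align*}
```

With m fixed at 0, the fit is only appropriate if X_t already has mean zero, so R has to be de-meaned by hand before it is passed in.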