Calculation of the AD statistic
Hi,
Thanks for creating the package!
I'm running some tests with the AD test and have a question about how the AD statistic is calculated.
According to the README.md or ad_test.Rd file, the AD statistic is calculated as follows:
$AD = \sum_{x \in k} \left({|E(x)-F(x)| \over \sqrt{2G(x)(1-G(x))/n} }\right)^p$
It seems to me that there may be two issues: 1) the formula assumes the two sample sizes are the same; and 2) the approximation of the integral is not calculated correctly.
Let the sample sizes be n1 and n2, with corresponding ECDFs E and F in your notation; let n=n1+n2 and let G be the ECDF of the joint sample. When p=2, $AD = \frac{n1 \times n2}{n} \int \frac{(E(x)-F(x))^2}{G(x)(1-G(x))} \, dG(x)$; see F. W. Scholz and M. A. Stephens (1987), K-Sample Anderson-Darling Tests.
Let x_i denote the data in the joint sample; then the integral should be approximated by $\frac{1}{n} \sum_{i \in [n]} \frac{(E(x_i)-F(x_i))^2}{G(x_i)(1-G(x_i))}$. Recall that there is an extra factor of $n1 \times n2 / n$: if $n1=n2=n/2$, then $AD = \frac{1}{4} \sum_{i \in [n]} \frac{(E(x_i)-F(x_i))^2}{G(x_i)(1-G(x_i))}$, which is different from your formula (an extra $n$ is multiplied there).
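A minimal sketch comparing the two scalings numerically. It assumes the README sum runs over the joint sample and that the joint maximum, where G(x)(1-G(x)) = 0, is skipped. The names `ad_readme` and `ad_scholz` are hypothetical, and this is not the package's code. Under these assumptions both statistics are multiples of the same sum, so their ratio should be n^3 / (2 * n1 * n2), which reduces to 2n when n1 = n2.

```python
from bisect import bisect_right
import random


def ecdf(sample):
    """Return the right-continuous empirical CDF of `sample` as a function."""
    s = sorted(sample)
    return lambda x: bisect_right(s, x) / len(s)


def ad_terms(a, b):
    """Per-point terms (E - F)^2 / (G (1 - G)) over the joint sample."""
    joint = list(a) + list(b)
    E, F, G = ecdf(a), ecdf(b), ecdf(joint)
    terms = []
    for x in joint:
        w = G(x) * (1.0 - G(x))
        if w > 0:  # assumption: skip the joint maximum, where G = 1 and E = F
            terms.append((E(x) - F(x)) ** 2 / w)
    return terms


def ad_readme(a, b):
    """README formula with p = 2: sum of (E - F)^2 / (2 G (1 - G) / n)."""
    n = len(a) + len(b)
    return (n / 2.0) * sum(ad_terms(a, b))


def ad_scholz(a, b):
    """Discretized integral form: (n1 * n2 / n) * (1 / n) * sum of terms."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    return (n1 * n2 / n) * sum(ad_terms(a, b)) / n


rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(30)]
b = [rng.gauss(0.5, 1.0) for _ in range(50)]

ratio = ad_readme(a, b) / ad_scholz(a, b)
print("observed ratio:", ratio)
print("n^3 / (2 n1 n2):", (30 + 50) ** 3 / (2 * 30 * 50))
```

Because the ratio depends only on n1 and n2, it changes whenever the sample sizes change, even if the underlying distributions do not.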
Also, when I tried some simple datasets, the test statistic returned by ad_test depended on the total sample size.
Please let me know if this makes sense, or if I am wrong.
Thanks!
I am not entirely certain that I understand your concerns, but I'll try to summarize.
My calculation has no (n1 * n2)/n out in front of the sum. It has an additional factor of sqrt(2/n) in the denominator -- corresponding to multiplying out front by n/2 when p=2 (the default). If n1=n2, then the correct adjustment is n/4, which is different; when n1 != n2, the correct adjustment is just n1*n2/n, which is also different. I may have missed something -- please let me know if I did.
This is correct, and in that sense this does not perfectly align with the equation you attribute to Scholz et al. (sorry, I can't see that paper -- and I haven't immediately connected the dots to the Wikipedia version).
It also means the calculated test statistic will depend on the relative sample sizes. However, the difference is a constant multiplicative factor that is the same for all resamples, so it will not affect the p-values.
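A minimal sketch of why a constant factor leaves a resampling p-value unchanged, assuming a permutation-style resampling scheme. The helper `permutation_p_value` is hypothetical and is not the package's implementation. The sample sizes are chosen as n1 = n2 = 16 so that both scale factors are powers of two and the comparison is exact in floating point.

```python
from bisect import bisect_right
import random


def ad_sum(a, b):
    """Unscaled sum of (E - F)^2 / (G (1 - G)) over the joint sample."""
    sa, sb, joint = sorted(a), sorted(b), sorted(a + b)
    total = 0.0
    for x in joint:
        E = bisect_right(sa, x) / len(sa)
        F = bisect_right(sb, x) / len(sb)
        G = bisect_right(joint, x) / len(joint)
        if G < 1.0:  # assumption: skip the joint maximum, where G = 1
            total += (E - F) ** 2 / (G * (1.0 - G))
    return total


def permutation_p_value(a, b, scale, n_perm=500, seed=1):
    """Hypothetical helper: permutation p-value for scale * ad_sum."""
    rng = random.Random(seed)  # same seed gives the same permutations
    observed = scale * ad_sum(a, b)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = scale * ad_sum(pooled[:len(a)], pooled[len(a):])
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(16)]
b = [rng.gauss(0.5, 1.0) for _ in range(16)]
n1, n2 = len(a), len(b)
n = n1 + n2

p_readme = permutation_p_value(a, b, scale=n / 2)             # n/2 scaling
p_scholz = permutation_p_value(a, b, scale=n1 * n2 / n ** 2)  # n1*n2/n^2 scaling
print(p_readme, p_scholz)
```

The observed and resampled statistics are multiplied by the same positive constant, so every comparison between them keeps its outcome and the count of exceedances is identical.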
If you were to take the test statistic calculated by this package and compare it to the null distribution calculated by someone else, that could cause problems -- I do not recommend doing that. Though hypothetically in that case, you could just figure out the constant factor correction and adjust for it. Alternatively, if you were to compare test statistics across multiple studies for some reason, it wouldn't work well unless you adjusted for those constant factors. I also do not recommend this -- though comparing the p-values should be fine.
Frankly, I went back and forth about adding the factor of 2 in the denominator for a similar reason. It had no effect on p-values; it purely worked as a constant scaling of the test statistic. I wound up adding it for aesthetic reasons -- with the 2, the denominator is the variance of the joint distribution, and without it, it isn't. I think I ditched the n1*n2/n for consistency across the test statistics -- but honestly I'm not sure -- and obviously that created some inconsistency in which constant factors I include and which I don't.
tl;dr: I think you may be right, but p-values should be unaffected. Do not compare test statistics from this package to null distributions calculated elsewhere or to test statistics in other studies -- that is what the p-values are for.
Thanks for the fast response and great summary. And yes, you are correct that the p-value should be unaffected.
Just a few comments to the points you have mentioned.
- The Scholz paper I mentioned covers the k-sample AD test. You do not need to log in; the preview alone shows the formula for the 2-sample statistic.
- I just checked the wiki page. It has the formula for the 1-sample test, but I did not find the 2-sample case. The last section, on the k-sample test, is actually based on the Scholz paper mentioned above.
- I agree with you that the p-value calculation is unaffected. It is just a matter of consistency between the statistics.