Lowest nonzero p-value and float64
Hey,
In my Wald tests, the lowest non-zero p-value I get is 2.220446049250313e-16. I'm using the master branch and the default dtype seems to be float64. I was wondering why it's not lower. Given that I also have p-values of exactly zero, I should be able to see lower non-zero p-values with float64, no?
Sorry for the delay @gokceneraslan, I think I have not propagated the precision all the way into the GLM backend for the numpy/dask default optimiser. Otherwise, this may also occur at the level of the evaluation of the test-statistic distribution; I will look into this. For now, maybe think about whether you really need this to move forward? At that level of significance, I usually threshold based on mean effects (in addition to p-values) rather than distinguishing between extremely low p-values, because I find mean effects more meaningful.
Hey @davidsebfischer, yeah, not that I'm super interested in p-values around 10^-180, but it's just a technical error to have such coarse granularity in p-values, which ideally shouldn't happen with the float64 dtype.
Is there an easy fix for this? I want to make a volcano plot, but I have a bunch of p-values at 0.
Seems I have answered my own question, and it's a very easy fix. In `stats.py`, if you change

```python
pvals = 2 * (1 - scipy.stats.norm(loc=0, scale=1).cdf(wald_statistic))
```

to

```python
pvals = 2 * scipy.stats.norm(loc=0, scale=1).sf(wald_statistic)
```

it fixes the issue. The survival function is mathematically the same thing as 1 - cdf, but its numerical precision is much higher. Luckily I had already spent a few hours figuring this out in a completely unrelated package!
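For anyone curious, here is a minimal sketch of why `1 - cdf` bottoms out around 2.22e-16 while `sf` does not. The statistic value 10.0 is just an illustrative example, not from the package's actual output:

```python
from scipy import stats

# A Wald statistic large enough that cdf(w) rounds to exactly 1.0 in float64
w = 10.0

# Naive two-sided p-value: 1 - cdf underflows to 0 because cdf(10) is closer
# to 1.0 than float64 machine epsilon (~2.22e-16) allows representing
p_naive = 2 * (1 - stats.norm.cdf(w))

# The survival function evaluates the upper tail directly, so no precision
# is lost subtracting from 1.0
p_sf = 2 * stats.norm.sf(w)

print(p_naive)  # 0.0
print(p_sf)     # ~1.5e-23
```

The general point is that any p-value computed via `1 - cdf` can never be smaller than the spacing of float64 numbers near 1.0, which is exactly the 2.220446049250313e-16 floor reported above.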
@davidsebfischer it would be nice if you could push this update so I don't have to manually patch my ever-growing accumulation of conda environments. Cheers!