Potential discrepancies between linear model diagnostic tests
Hello, I noticed some discrepancies between the results of model diagnostic tests, specifically the tests for heteroskedasticity (non-constant variance). I have included the tests for normality and autocorrelation of residuals for completeness, but it is the heteroskedasticity test I am most concerned about, given the magnitude of the difference:
model <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)
performance::check_heteroskedasticity(model)
#> Warning: Heteroscedasticity (non-constant error variance) detected (p = 0.042).
lmtest::bptest(model) # studentized
#>
#> studentized Breusch-Pagan test
#>
#> data: model
#> BP = 6.4424, df = 4, p-value = 0.1685
lmtest::bptest(model, studentize = FALSE)
#>
#> Breusch-Pagan test
#>
#> data: model
#> BP = 7.9496, df = 4, p-value = 0.09344
shapiro.test(model$residuals)
#>
#> Shapiro-Wilk normality test
#>
#> data: model$residuals
#> W = 0.95546, p-value = 0.2056
performance::check_normality(model)
#> OK: residuals appear as normally distributed (p = 0.230).
performance::check_autocorrelation(model)
#> OK: Residuals appear to be independent and not autocorrelated (p = 0.262).
lmtest::dwtest(model, alternative = "two.sided")
#>
#> Durbin-Watson test
#>
#> data: model
#> DW = 1.7786, p-value = 0.2846
#> alternative hypothesis: true autocorrelation is not 0
Created on 2025-02-19 with reprex v2.1.1
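If it helps the investigation, one guess at the source of the gap (an assumption on my part; I have not checked the performance source): check_heteroskedasticity() may be testing the error variance against the fitted values only (1 df, as car::ncvTest() does by default), whereas bptest() by default uses all four regressors of the main model (4 df) and studentizes. Something along these lines would put the two packages on the same auxiliary specification:
# Breusch-Pagan variants with the variance modelled as a function of the
# fitted values only, for comparison with check_heteroskedasticity():
lmtest::bptest(model, varformula = ~ fitted(model))                      # studentized
lmtest::bptest(model, varformula = ~ fitted(model), studentize = FALSE)  # classical
car::ncvTest(model)  # non-constant variance score test against the fitted values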
We can look into it, but personally I would recommend against any of these tests and would instead use the graphical check provided by check_model().
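For reference, the graphical check is a one-liner; it draws a panel of diagnostic plots (including a homogeneity-of-variance panel) rather than returning p-values. Plotting requires the see package:
# Visual inspection of all assumptions at once
performance::check_model(model)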
This is a great package and I'm really enjoying using it. I'd like to ask, though: why do you suggest not using the tests if they are provided? From my perspective as a user, I wanted to use the check_model() function for its convenience, but I also need to check and compare a lot of models, and having a cutoff (like a p-value) makes that much easier. In my case (and possibly others), running the individual functions (as appropriate) to capture both the plot and the output is preferable to check_model(), which offers only the visualizations. So, I'm curious about the reasoning behind your suggestion.
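Concretely, the kind of batch check I have in mind is along these lines (assuming, as the printed message suggests, that check_heteroskedasticity() returns the p-value it reports; I have not verified the exact return value):
# Hypothetical batch workflow: collect the heteroskedasticity p-values for
# several candidate models in one table.
models <- list(
  small = lm(mpg ~ wt, data = mtcars),
  full  = lm(mpg ~ wt + cyl + gear + disp, data = mtcars)
)
p_het <- sapply(models, function(m) as.numeric(performance::check_heteroskedasticity(m)))
data.frame(model = names(p_het), p_heteroskedasticity = p_het)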
The recommendation to prioritize check_model() is based on the principle that visual inspection provides a richer, more robust, and less arbitrary assessment of model adequacy than formal hypothesis tests.
The real question is not "Are the assumptions perfectly met?" (the answer is always no), but rather "Are the violations of these assumptions severe enough to compromise the conclusions I want to draw from this model?" Knowing the nature of a violation is essential for fixing it: a plot guides you toward a solution (e.g., logging a variable, adding a polynomial term, or using a different model family), whereas a p-value leaves you in the dark.
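To make that concrete, a minimal sketch of the workflow (the log transformation here is purely illustrative, not a recommendation for this particular model):
# Inspect the spread of the residuals against the fitted values; a widening
# funnel suggests non-constant variance.
plot(fitted(model), residuals(model),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
# If a funnel shows up, a variance-stabilising transformation is one candidate
# fix; re-inspect the diagnostics afterwards.
model_log <- lm(log(mpg) ~ wt + cyl + gear + disp, data = mtcars)
performance::check_model(model_log)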
For a great discussion of this perspective, check out Applied Linear Regression by Sandy Weisberg (any edition).