StatsAPI.jl icon indicating copy to clipboard operation
StatsAPI.jl copied to clipboard

Add keyword arg to modelmatrix; define momentmatrix

Open gragusa opened this issue 3 years ago • 13 comments

  1. The modelmatrix has now a keyword weighted=false which is useful for dealing with weighted models.
  2. Add momentmatrix - this function is intended to return the matrix of estimating equations; for instance, for a linear model should return u*X, where u is the vector of residuals and X is the model matrix.

gragusa avatar Jun 15 '22 17:06 gragusa

Codecov Report

Attention: Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.43%. Comparing base (20b38e1) to head (93f8742). Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/regressionmodel.jl 75.00% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##              main      #16      +/-   ##
===========================================
- Coverage   100.00%   97.43%   -2.57%     
===========================================
  Files            3        2       -1     
  Lines           37       39       +2     
===========================================
+ Hits            37       38       +1     
- Misses           0        1       +1     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jun 16 '22 22:06 codecov-commenter

The only thing that we probably should do is to allow for modelmatrix to take another keyword argument, e.g., droppcollinearcols, to return only the columns corresponding to non-NaN coefficients in a Pivot Cholesky.

We could do it next (after I drop a bomb-PR against GLM. The GLM PR is waiting for this PR to get merged)

gragusa avatar Jun 21 '22 14:06 gragusa

Let's tackle this separately. :-)

I'd rather review the GLM PR before merging this one, usually having the implementation is a good way to check that the API is the right one.

nalimilan avatar Jun 22 '22 19:06 nalimilan

I think it would be helpful to think about the API for dealing with rank-deficient models. For instance, ‘modelmatrix’ returns all the columns even those related to coefficients that cannot be estimated. I think this is fine as a default, but it would be useful to have a keyword argument to return only the columns corresponding to the estimable coefficients and a mechanism to identify the indexes of these columns. I have hacked these second part in this PR.

gragusa avatar Sep 10 '22 15:09 gragusa

Something that would be very useful and that I would like to add o this PR is invloglikhessian - inverse of the hessian of the likelihood. The name is not great, but invloglikelihoodhessian is a monster, and I don't like invloglikhess (but I would do anything to have this merged.

With this method defined (whose implementation for GLM is part of https://github.com/JuliaStats/GLM.jl/pull/487) CovarianceMatrices.jl could drop the GLM dependency, and implementation of all the covariances could be done via the API.

gragusa avatar Nov 17 '22 21:11 gragusa

Sure. Would it make sense to call it invhessian? Do other modeling packages define this function currently?

nalimilan avatar Nov 27 '22 15:11 nalimilan

Sure. Would it make sense to call it invhessian? Do other modeling packages define this function currently?

I don't think so. The R package dealing with robust variances uses meat and bread which is not very clear (but in that case does not matter since R is not composable and methods can be extended only by the package itself).

We already have the inverse of the log hessian of the likelihood in glm -- is invchol. For linear models, the inverse of the normal log-likelihood is (X'X)^{-1}. But for general likelihoods this is not the case.

Now, invhessian is a little too generic, but I could live with it.

gragusa avatar Dec 20 '22 17:12 gragusa

I don't have a strong preference, but at least for consistency I think we should spell "likelihood" in full if we use that term. Luckily autocompletion will almost always work. ;-)

nalimilan avatar Dec 24 '22 18:12 nalimilan

Is there anything left to decide here?

nalimilan avatar Dec 12 '24 13:12 nalimilan

I think we still need to decide. As of now, these two methods are part of https://github.com/JuliaStats/GLM.jl/pull/487. They would be helpful in extending the methods defined in CovarianceMatrices.jl

gragusa avatar Dec 12 '24 14:12 gragusa

But decide what? :-)

nalimilan avatar Dec 12 '24 16:12 nalimilan

Whether to merge?

gragusa avatar Dec 13 '24 21:12 gragusa

According to our discussions a few things needed changing AFAICT.

nalimilan avatar Dec 18 '24 11:12 nalimilan