StructuralEquationModels.jl

Numerical precision of model-implied covariance matrix

Open brandmaier opened this issue 4 months ago • 3 comments

I encountered a problem with the model-implied covariance matrix of a SEM in SEM.jl. The model is a standard linear latent growth curve model fitted on data perfectly simulated from that model.

What I get is this:

Σ      = fitted.model.implied.Σ  
Σ

5×5 Matrix{Float64}:
 4.24403   0.131595  0.131595   0.131595   0.131595
 0.131595  5.2176    2.16186    3.177      4.19213
 0.131595  2.16186   8.04678    6.2224     8.25267
 0.131595  3.177     6.2224    13.6249    12.3132
 0.131595  4.19213   8.25267   12.3132    20.5345

But this matrix turns out to be non-Hermitian, so I cannot use it right away to randomly draw observations from this model. To check, I found this:

 Symmetric(Σ)-Σ

5×5 Matrix{Float64}:
 0.0  0.0          0.0          0.0  0.0
 0.0  0.0          0.0          0.0  0.0
 0.0  0.0          0.0          0.0  0.0
 0.0  4.44089e-16  8.88178e-16  0.0  0.0
 0.0  0.0          0.0          0.0  0.0

Is this a known issue?

I attach the dataset and paste the model here


using CSV, DataFrames
using StructuralEquationModels

data = CSV.read("lgcm.csv", DataFrames.DataFrame;
    normalizenames = true, 
    delim          = ',',
    missingstring  = ["", "NA", "NaN"],       
    ignorerepeated = true                     
)

obs = [:y1, :y2, :y3, :y4, :y5]
lat = [:i, :s]

graph = @StenoGraph begin
    # Intercept factor: all loadings fixed to 1
    i → fixed(1)*y1 + fixed(1)*y2 + fixed(1)*y3 + fixed(1)*y4 + fixed(1)*y5

    # Slope factor: linear time scores 0,1,2,3,4
    s → fixed(0)*y1 + fixed(1)*y2 + fixed(2)*y3 + fixed(3)*y4 + fixed(4)*y5

    # Residual variances (free) and latent (co)variances (free)
    _(obs) ↔ _(obs)     # variances for observed variables
    _(lat) ↔ _(lat)     # variances + covariance for latent factors

    # Latent means (estimate μ_i and μ_s); 
    # observed means fixed to 0 by omission
    Symbol(1) → i + s
end


partable = ParameterTable(
    graph,
    latent_vars   = lat,
    observed_vars = obs
)

model = Sem(
    specification = partable,
    data          = data,        
    meanstructure = true
)

fitted = fit(model)

lgcm.csv

brandmaier, Oct 08 '25 12:10

My current fix is simply MvNormal(μ, Symmetric(Σ))
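For readers who want to try this workaround, here is a minimal sketch. It assumes Distributions.jl is installed, uses the rounded Σ values printed above, and substitutes a placeholder zero mean vector for μ, because the fitted means are not shown in this thread.

```julia
using Distributions, LinearAlgebra

# Rounded copy of the model-implied covariance matrix printed in the issue.
Σ = [4.24403   0.131595  0.131595   0.131595   0.131595
     0.131595  5.2176    2.16186    3.177      4.19213
     0.131595  2.16186   8.04678    6.2224     8.25267
     0.131595  3.177     6.2224    13.6249    12.3132
     0.131595  4.19213   8.25267   12.3132    20.5345]

# Placeholder mean vector (assumption): the fitted means are not shown here.
μ = zeros(5)

# Symmetric(Σ) reads only the upper triangle, so rounding noise in the
# lower triangle cannot make the matrix look non-Hermitian.
d = MvNormal(μ, Symmetric(Σ))

# Draw 100 observations; each column of X is one draw.
X = rand(d, 100)
```

Note that `Symmetric` discards the lower triangle rather than averaging the two halves; with differences of order 1e-16 the choice should not matter in practice.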

brandmaier, Oct 08 '25 12:10

In general this is expected: matrix computations always carry a bit of numerical imprecision, so the model-implied covariance matrix we get will almost never be perfectly symmetric.
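As a rough illustration, the sketch below builds a covariance matrix from a matrix product and measures how far it is from exact symmetry. The loading and covariance values are made up for the example, and the formula ΛΨΛ' + Θ is only a stand-in for the way the package actually computes Σ. It uses only the LinearAlgebra standard library.

```julia
using LinearAlgebra

# Hypothetical growth-model quantities (illustrative values, not the fitted ones).
Λ = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0; 1.0 4.0]  # intercept and slope loadings
Ψ = [1.1 0.13; 0.13 0.9]                            # latent covariance matrix
Θ = Diagonal(fill(0.3, 5))                          # residual variances

# A covariance matrix assembled from floating-point products.
Σ = Λ * Ψ * Λ' + Θ

# Largest absolute difference between Σ and its transpose. This may be
# exactly zero or of the order of machine epsilon, depending on rounding.
asym = maximum(abs, Σ - Σ')

# Compare with a tolerance instead of demanding exact symmetry.
nearly_symmetric = isapprox(Σ, Σ'; rtol = 1e-12)

# Wrapping in Symmetric makes the matrix exactly symmetric for downstream code.
Σsym = Symmetric(Σ)
issymmetric(Σsym)   # true
```

Checking with a tolerance and then wrapping in `Symmetric` keeps the values intact while giving routines such as `cholesky` an exactly symmetric input.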

To draw normally distributed samples from a model, you can use our method for rand. From its help text:

    (1) rand(model::AbstractSemSingle, params, n)

    (2) rand(model::AbstractSemSingle, n)

Sample normally distributed data from the model-implied covariance matrix and mean vector.

# Arguments
- `model::AbstractSemSingle`: model to simulate from.
- `params`: parameter values to simulate from.
- `n::Integer`: Number of samples.

# Examples
rand(model, start_simple(model), 100)

Does this solve the problem?

Maximilian-Stefan-Ernst, Oct 15 '25 11:10

https://discourse.julialang.org/t/issue-pdmats-jl-matrices-losing-positive-definiteness-with-optim-jl-autodiff/133421/18

aaronpeikert, Oct 29 '25 09:10