Regularized SEM

Joshua Pritikin and Ross Jacobucci

2023-11-27

Simulate Data

We will first simulate data

library(OpenMx)

manifests <- c(paste0('x',1:8), paste0('y',1:5))

set.seed(12)

sim.mod <- mxModel(
  "sim", type="RAM", manifestVars = manifests, latentVars = 'f1',
  mxPath(paste0('x',1:8), 'f1', values=c(0,0,0,0,0,.2,.5,.8),
         labels=paste0('c',1:8)),
  mxPath('f1', paste0('y',1:5), values=1),
  mxPath(paste0('x',1:8), arrows=2, connect = "unique.bivariate",
         values=rnorm(8*7/2, sd=.2)),
  mxPath(paste0('x',1:8), arrows=2, values=1),
  mxPath(paste0('y',1:5), arrows=2, values=1),
  mxPath('f1', arrows=2, values=1, free=FALSE),
  mxPath('one', manifests),
  mxPath('one', 'f1', free=FALSE))

dat.sim = mxGenerateData(sim.mod, nrows = 100)

Run the Model

And then run the model so we can better see the structure.

run.mod <- mxModel(sim.mod, mxData(dat.sim, type="raw"))

fit <- mxRun(run.mod)
## Running sim with 67 parameters
#summary(fit)
summary(fit)$parameters[1:8,]
##   name matrix row col    Estimate Std.Error lbound ubound lboundMet uboundMet
## 1   c1      A  f1  x1  0.02937847 0.1821958     NA     NA     FALSE     FALSE
## 2   c2      A  f1  x2  0.18267962 0.1456884     NA     NA     FALSE     FALSE
## 3   c3      A  f1  x3 -0.16967274 0.1330826     NA     NA     FALSE     FALSE
## 4   c4      A  f1  x4  0.07675954 0.1387142     NA     NA     FALSE     FALSE
## 5   c5      A  f1  x5  0.06367310 0.1439198     NA     NA     FALSE     FALSE
## 6   c6      A  f1  x6  0.35343522 0.1585969     NA     NA     FALSE     FALSE
## 7   c7      A  f1  x7  0.74058477 0.1420938     NA     NA     FALSE     FALSE
## 8   c8      A  f1  x8  0.72244144 0.1489267     NA     NA     FALSE     FALSE

Regularize

One of the difficult pieces in using regularization is that the penalty has to be calibrated for each particular problem. In running this code, I first tested the default, but this was too small given that there were some parameters > 0.4. After plotting this initial run, I saw that some parameters weren’t penalized enough, therefore, I increased the penalty step to 1.2 and with 41 different values. This tested a model that had most estimates as zero. In some cases, it isn’t necessary to test a sequence of penalties that would set “large” parameters to zero, as either the model could fail to converge then, or there is not uncertainty about those parameters inclusion.

regFit <- mxPenaltySearch(mxModel(
  fit, mxPenaltyLASSO(paste0('c',1:8),"lasso",lambda.step=1.2),
  mxMatrix('Full',1,1,free=TRUE,values=0, labels="lambda")))
## Running sim with 68 parameters
## Warning: In model 'sim' Optimizer returned a non-zero status code 6. The model
## does not satisfy the first-order optimality conditions to the required
## accuracy, and no improved point for the merit function could be found during
## the final linesearch (Mx status RED)

A status code 6 warning is issued because the parameters affected by regularization have relatively large gradients.

round(regFit$output$gradient, 2)
##           c1           c2           c3           c4           c5           c6 
##        -4.41        -5.49        12.77         9.90        -7.17         5.02 
##           c7           c8  sim.A[9,14] sim.A[10,14] sim.A[11,14] sim.A[12,14] 
##         0.00         0.00         0.00         0.00         0.00         0.00 
## sim.A[13,14]   sim.S[1,1]   sim.S[1,2]   sim.S[2,2]   sim.S[1,3]   sim.S[2,3] 
##         0.00         0.01         0.00         0.01         0.00         0.01 
##   sim.S[3,3]   sim.S[1,4]   sim.S[2,4]   sim.S[3,4]   sim.S[4,4]   sim.S[1,5] 
##         0.00         0.00        -0.02        -0.01         0.01        -0.03 
##   sim.S[2,5]   sim.S[3,5]   sim.S[4,5]   sim.S[5,5]   sim.S[1,6]   sim.S[2,6] 
##         0.00         0.00         0.00         0.01         0.00         0.00 
##   sim.S[3,6]   sim.S[4,6]   sim.S[5,6]   sim.S[6,6]   sim.S[1,7]   sim.S[2,7] 
##         0.01         0.02         0.00         0.00         0.01         0.01 
##   sim.S[3,7]   sim.S[4,7]   sim.S[5,7]   sim.S[6,7]   sim.S[7,7]   sim.S[1,8] 
##        -0.01         0.00         0.00         0.01        -0.01        -0.01 
##   sim.S[2,8]   sim.S[3,8]   sim.S[4,8]   sim.S[5,8]   sim.S[6,8]   sim.S[7,8] 
##        -0.01         0.00         0.00        -0.01         0.02         0.00 
##   sim.S[8,8]   sim.S[9,9] sim.S[10,10] sim.S[11,11] sim.S[12,12] sim.S[13,13] 
##        -0.01         0.00         0.00         0.00         0.00         0.00 
##   sim.M[1,1]   sim.M[1,2]   sim.M[1,3]   sim.M[1,4]   sim.M[1,5]   sim.M[1,6] 
##        -0.01         0.02         0.02         0.02         0.01        -0.01 
##   sim.M[1,7]   sim.M[1,8]   sim.M[1,9]  sim.M[1,10]  sim.M[1,11]  sim.M[1,12] 
##         0.02         0.00         0.00         0.00         0.00         0.00 
##  sim.M[1,13] 
##         0.00

The gradient check is only done for the model with the best EBIC. Next, we can get a summary of the models tested to check if there were any optimization failures.

detail <- regFit$compute$steps$PS$output$detail
table(detail$statusCode)
## 
##                               OK                         OK/green 
##                               41                                0 
##     infeasible linear constraint infeasible non-linear constraint 
##                                0                                0 
##             iteration limit/blue                   not convex/red 
##                                0                                0 
##             nonzero gradient/red                        bad deriv 
##                                0                                0 
##                                ?                   internal error 
##                                0                                0 
##                 infeasible start 
##                                0

Looks good. We can also look at summaries of some of the results,

range(detail$lambda)
## [1]  0 48
range(detail$EBIC)
## [1] 551.1859 576.8747
best <- which(detail$EBIC == min(detail$EBIC))
detail[best, 'lambda']
## [1] 37.2

Plot the parameter trajectories

library(reshape2)
library(ggplot2)

est <- detail[,c(paste0('c',1:8), 'lambda')]
ggplot(melt(est, id.vars = 'lambda')) +
  geom_line(aes(x=lambda, y=value, color=variable)) +
  geom_vline(aes(xintercept=coef(regFit)['lambda']),
             linetype="dashed", alpha=.5)

plot of chunk unnamed-chunk-8

Here we can see that we used a large enough penalty to set most parameter estimates to zero. The best fitting model is indicated by the dashed lines.

OpenMx uses EBIC to choose a final model. See what the best fitting parameter estimates are.

summary(regFit)$parameters[1:8,]
##   name matrix row col      Estimate Std.Error lbound ubound lboundMet uboundMet
## 1   c1      A  f1  x1 -9.812532e-06        NA     NA     NA     FALSE     FALSE
## 2   c2      A  f1  x2  7.857698e-07        NA     NA     NA     FALSE     FALSE
## 3   c3      A  f1  x3 -5.665047e-07        NA     NA     NA     FALSE     FALSE
## 4   c4      A  f1  x4  4.555886e-06        NA     NA     NA     FALSE     FALSE
## 5   c5      A  f1  x5  1.301791e-06        NA     NA     NA     FALSE     FALSE
## 6   c6      A  f1  x6  8.244870e-06        NA     NA     NA     FALSE     FALSE
## 7   c7      A  f1  x7  3.217185e-01        NA     NA     NA     FALSE     FALSE
## 8   c8      A  f1  x8  3.858017e-01        NA     NA     NA     FALSE     FALSE

In this final model, we set the regression paths for x1, x2, x3, x4, x5, and x6 to zero. We also correctly identify x7 and x8 as true paths. Compare these results with the maximum likelihood estimates.