Observed variables in the LMS and QML approaches


The Latent Moderated Structural Equations (LMS) and the Quasi Maximum Likelihood (QML) Approach

As of version 1.0.16, observed variables are supported (in most circumstances) for both the LMS and QML approaches. Here we can see an example using only observed variables.

library(modsem)

fit <- modsem('y1 ~ x1 + z1 + x1:z1', data = oneInt, method = "lms")
summary(fit, standardized = TRUE)
#> 
#> modsem (1.0.16) ended normally after 2 iterations
#> 
#>   Estimator                                        LMS
#>   Optimization method                       EMA-NLMINB
#>   Number of model parameters                        14
#> 
#>   Number of observations                          2000
#> 
#> Loglikelihood and Information Criteria:
#>   Loglikelihood                              -12260.74
#>   Akaike (AIC)                                24549.48
#>   Bayesian (BIC)                              24627.89
#>  
#> Numerical Integration:
#>   Points of integration (per dim)                   24
#>   Dimensions                                         0
#>   Total points of integration                        1
#> 
#> Fit Measures for Baseline Model (H0):
#>                                               Standard
#>   Chi-square                                    504.08
#>   Degrees of Freedom (Chi-square)                    1
#>   P-value (Chi-square)                           0.000
#>   RMSEA                                          0.502
#>                                                       
#>   Loglikelihood                              -12512.91
#>   Akaike (AIC)                                25051.81
#>   Bayesian (BIC)                              25124.62
#>  
#> Comparative Fit to H0 (LRT test):
#>   Loglikelihood change                          252.17
#>   Difference test (D)                           504.33
#>   Degrees of freedom (D)                             1
#>   P-value (D)                                    0.000
#>  
#> R-Squared Interaction Model (H1):
#>   y1                                             0.471
#> R-Squared Baseline Model (H0):
#>   y1                                             0.320
#> R-Squared Change (H1 - H0):
#>   y1                                             0.152
#> 
#> Parameter Estimates:
#>   Coefficients                            standardized
#>   Information                                 observed
#>   Standard errors                             standard
#>  
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)  Std.all
#>   y1 ~          
#>     x1              0.055      0.034    1.619    0.105    0.398
#>     z1             -0.066      0.035   -1.902    0.057    0.329
#>     x1:z1           0.544      0.023   23.950    0.000    0.389
#> 
#> Intercepts:
#>                  Estimate  Std.Error  z.value  P(>|z|)  Std.all
#>     x1              1.023      0.024   42.827    0.000         
#>     z1              1.011      0.024   41.562    0.000         
#>     x1:z1           1.244      0.046   26.903    0.000         
#>    .y1              0.514      0.046   11.146    0.000         
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)  Std.all
#>   x1 ~~         
#>     z1              0.210      0.026    7.950    0.000    0.181
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)  Std.all
#>     x1              1.141      0.036   31.628    0.000    1.000
#>     z1              1.184      0.037   31.632    0.000    1.000
#>    .y1              1.399      0.044   31.623    0.000    0.530

If you’re using an older version of modsem, the rest of this vignette details how to handle observed variables in the LMS and QML approaches. There may also be cases where converting observed variables to latent ones is a more reliable option; for example, the LMS approach may struggle if it has to integrate along an observed variable.

The LMS Approach

For the LMS approach, you can use the above-mentioned method in almost all cases, except when using an observed variable as a moderating variable. In the LMS approach, you typically select one variable in an interaction term as the moderator. The interaction effect is then estimated via numerical integration at n quadrature nodes of the moderating variable. However, this process requires that the moderating variable has an error term, as the distribution of the moderating variable is modeled as X ~ N(Az, ε), where Az is the expected value of X at quadrature point k, and ε is the error term. If the error term is zero, the probability of observing a given value of X will not be computable.
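The problem can be illustrated directly in R. The LMS likelihood evaluates a normal density for the moderator's indicator at each quadrature point, with the indicator's residual variance as the variance of that density. This is a schematic sketch (not modsem's internal code); the values of `x1` and `mu_k` are made up for illustration:

```r
# Schematic sketch (not modsem's actual implementation): the likelihood
# contribution of the moderator's indicator at a quadrature point is a
# normal density whose variance is the indicator's residual variance.
x1   <- 1.2 # observed value of the indicator
mu_k <- 1.0 # model-implied mean of the indicator at quadrature point k

dnorm(x1, mean = mu_k, sd = sqrt(0.1)) # positive residual variance: density is finite and positive
dnorm(x1, mean = mu_k, sd = 0)         # zero residual variance: density degenerates to 0
dnorm(mu_k, mean = mu_k, sd = 0)       # ... and to Inf exactly at the mean
```

With a zero residual variance the density is 0 almost everywhere (and infinite at a single point), so the log-likelihood is not computable; this is why the moderator's indicator needs a non-zero measurement error.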

In most instances, the first variable in the interaction term is chosen as the moderator. For example, if the interaction term is "X:Z", "X" will usually be chosen as the moderator. Therefore, if only one of the variables is latent, you should place the latent variable first in the interaction term. If both variables are observed, you must specify a measurement error (e.g., "x1 ~~ 0.1 * x1") for the indicator of the first variable in the interaction term.

# Interaction effect between a latent and an observed variable
m1 <- '
# Outer Model
  X =~ x1 # X is observed
  Z =~ z1 + z2 # Z is latent
  Y =~ y1 # Y is observed

# Inner model
  Y ~ X + Z
  Y ~ Z:X
'

lms1 <- modsem(m1, oneInt, method = "lms")

# Interaction effect between two observed variables
m2 <- '
# Outer Model
  X =~ x1 # X is observed
  Z =~ z1 # Z is observed
  Y =~ y1 # Y is observed
  x1 ~~ 0.1 * x1 # Specify a variance for the measurement error

# Inner model
  Y ~ X + Z
  Y ~ X:Z
'

lms2 <- modsem(m2, oneInt, method = "lms")
summary(lms2)
#> 
#> modsem (1.0.16) ended normally after 7 iterations
#> 
#>   Estimator                                        LMS
#>   Optimization method                       EMA-NLMINB
#>   Number of model parameters                        10
#> 
#>   Number of observations                          2000
#> 
#> Loglikelihood and Information Criteria:
#>   Loglikelihood                               -9114.81
#>   Akaike (AIC)                                18249.62
#>   Bayesian (BIC)                              18305.62
#>  
#> Numerical Integration:
#>   Points of integration (per dim)                   24
#>   Dimensions                                         1
#>   Total points of integration                       24
#> 
#> Fit Measures for Baseline Model (H0):
#>                                               Standard
#>   Chi-square                                      0.00
#>   Degrees of Freedom (Chi-square)                    0
#>   P-value (Chi-square)                              NA
#>   RMSEA                                             NA
#>                                                       
#>   Loglikelihood                               -9369.23
#>   Akaike (AIC)                                18756.46
#>   Bayesian (BIC)                              18806.87
#>  
#> Comparative Fit to H0 (LRT test):
#>   Loglikelihood change                          254.42
#>   Difference test (D)                           508.85
#>   Degrees of freedom (D)                             1
#>   P-value (D)                                    0.000
#>  
#> R-Squared Interaction Model (H1):
#>   Y                                              0.498
#> R-Squared Baseline Model (H0):
#>   Y                                              0.335
#> R-Squared Change (H1 - H0):
#>   Y                                              0.163
#> 
#> Parameter Estimates:
#>   Coefficients                          unstandardized
#>   Information                                 observed
#>   Standard errors                             standard
#>  
#> Latent Variables:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   X =~          
#>     x1              1.000                             
#>   Z =~          
#>     z1              1.000                             
#>   Y =~          
#>     y1              1.000                             
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Y ~           
#>     X               0.664      0.031   21.322    0.000
#>     Z               0.481      0.028   16.936    0.000
#>     X:Z             0.588      0.025   23.489    0.000
#> 
#> Intercepts:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .x1              1.023      0.024   42.820    0.000
#>    .z1              1.011      0.024   41.557    0.000
#>    .y1              1.056      0.034   31.315    0.000
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   X ~~          
#>     Z               0.210      0.026    7.974    0.000
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .x1              0.100                             
#>    .z1              0.000                             
#>    .y1              0.000                             
#>     X               1.041      0.036   28.851    0.000
#>     Z               1.184      0.037   31.622    0.000
#>    .Y               1.321      0.044   29.931    0.000

If you forget to specify a measurement error for the indicator of the first variable in the interaction term, you will receive an error message.

m2 <- '
# Outer Model
  X =~ x1 # X is observed
  Z =~ z1 # Z is observed
  Y =~ y1 # Y is observed

# Inner model
  Y ~ X + Z
  Y ~ X:Z
'

lms3 <- modsem(m2, oneInt, method = "lms")
#> Error: The variance of a moderating variable of integration has an indicator with zero residual variance! 
#> This will likely not work with the LMS approach, see: 
#>    `vignette('observed_lms_qml', 'modsem')` for more information. 
#> 
#> The following indicators have zero residual variance:
#>   -> x1

Note: You only get an error message for X/x1, since Z is not modeled as a moderating variable in this example.
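Since the first variable in the interaction term is usually treated as the moderator, an equivalent fix is to reverse the order of the interaction term and attach the measurement error to the other indicator instead. The following is a hypothetical variant of the model above (assuming the same oneInt data), shown only to illustrate how the moderator is selected:

```r
# Hypothetical variant: write the interaction as Z:X so that Z is (usually)
# treated as the moderator, and specify the measurement error for its
# indicator z1 instead of x1.
m4 <- '
# Outer Model
  X =~ x1 # X is observed
  Z =~ z1 # Z is observed
  Y =~ y1 # Y is observed
  z1 ~~ 0.1 * z1 # measurement error for the indicator of the moderator

# Inner model
  Y ~ X + Z
  Y ~ Z:X # Z comes first, so Z should be chosen as the moderator
'

# lms4 <- modsem(m4, oneInt, method = "lms") # should now run without the error above
```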

The QML Approach

The estimation process for the QML approach differs from that of the LMS approach, so the same issue does not arise. Therefore, you don’t need to specify a measurement error for moderating variables.

m3 <- '
# Outer Model
  X =~ x1 # X is observed
  Z =~ z1 # Z is observed
  Y =~ y1 # Y is observed

# Inner model
  Y ~ X + Z
  Y ~ X:Z
'

qml3 <- modsem(m3, oneInt, method = "qml")
summary(qml3)
#> 
#> modsem (1.0.16) ended normally after 1 iterations
#> 
#>   Estimator                                        QML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        10
#> 
#>   Number of observations                          2000
#> 
#> Loglikelihood and Information Criteria:
#>   Loglikelihood                               -9117.07
#>   Akaike (AIC)                                18254.13
#>   Bayesian (BIC)                              18310.14
#>  
#> Fit Measures for Baseline Model (H0):
#>                                               Standard
#>   Chi-square                                      0.00
#>   Degrees of Freedom (Chi-square)                    0
#>   P-value (Chi-square)                              NA
#>   RMSEA                                             NA
#>                                                       
#>   Loglikelihood                               -9369.23
#>   Akaike (AIC)                                18756.46
#>   Bayesian (BIC)                              18806.87
#>  
#> Comparative Fit to H0 (LRT test):
#>   Loglikelihood change                          252.17
#>   Difference test (D)                           504.33
#>   Degrees of freedom (D)                             1
#>   P-value (D)                                    0.000
#>  
#> R-Squared Interaction Model (H1):
#>   Y                                              0.470
#> R-Squared Baseline Model (H0):
#>   Y                                              0.320
#> R-Squared Change (H1 - H0):
#>   Y                                              0.150
#> 
#> Parameter Estimates:
#>   Coefficients                          unstandardized
#>   Information                                 observed
#>   Standard errors                             standard
#>  
#> Latent Variables:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   X =~          
#>     x1              1.000                             
#>   Z =~          
#>     z1              1.000                             
#>   Y =~          
#>     y1              1.000                             
#> 
#> Regressions:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   Y ~           
#>     X               0.605      0.028   21.257    0.000
#>     Z               0.490      0.028   17.554    0.000
#>     X:Z             0.544      0.023   23.950    0.000
#> 
#> Intercepts:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .x1              1.023      0.024   42.827    0.000
#>    .z1              1.011      0.024   41.562    0.000
#>    .y1              1.066      0.034   31.640    0.000
#> 
#> Covariances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>   X ~~          
#>     Z               0.210      0.026    7.952    0.000
#> 
#> Variances:
#>                  Estimate  Std.Error  z.value  P(>|z|)
#>    .x1              0.000                             
#>    .z1              0.000                             
#>    .y1              0.000                             
#>     X               1.141      0.036   31.624    0.000
#>     Z               1.184      0.037   31.624    0.000
#>    .Y               1.399      0.044   31.623    0.000