Multilevel CFA (MLCFA) in R, part 2

Last updated on Jan 21, 2023 10 min read

A while back, I wrote a note about how to conduct a multilevel confirmatory factor analysis (MLCFA) in R. Part of the note shows how to setup lavaan to be able to run the MLCFA model. NOTE: one of the important aspects of an MLCFA is that the factor structure at the two levels may not be the same– that is the factor structures are invariant across levels. The setup process is/was cumbersome– but putting the note together was informative. Testing a 2-1 factor model (i.e., 2 factors at the first level and 1 factor at the second level) required the following code (see the original note for the detailed explanation of the setup and what the variables represent). [This is a measure of school engagement; n = 3,894 students in 254 schools].

1. Running an MLCFA manually

Read in the data

library(lavaan)

## Warning: package 'lavaan' was built under R version 3.4.4

## This is lavaan 0.6-3

## lavaan is BETA software! Please report any bugs.

source('http://faculty.missouri.edu/huangf/data/mcfa/mcfa.R')
raw <- read.csv("http://faculty.missouri.edu/huangf/data/mcfa/raw.csv")
head(raw)

##   x1 x2 x3 x4 x5 x6 sid
## 1  4  4  3  4  4  3  55
## 2  6  6  4  6  4  4  55
## 3  5  5  2  6  4  4  55
## 4  4  4  5  5  3  3  55
## 5  4  4  4  4  3  2  55
## 6  4  4  2  4  2  2  55

Just as a reminder, here are some item statistics:

Prepare the data for use with the mcfa.input function to create the necessary matrices and store them in x.

x <- mcfa.input("sid", raw) #sid is the school id or clustering variable
combined.cov <- list(within = x$pw.cov, between = x$b.cov)
combined.n <- list(within = x$n - x$G, between = x$G)
x$icc #view intraclass correlation (ICCs) of the six variables

##    x1    x2    x3    x4    x5    x6 
## 0.199 0.247 0.145 0.108 0.248 0.125

Specify the model:

level2.1factor <- '
  f1 =~ x1 + c(a,a)*x2 + c(b,b)*x4
  f2 =~ x3 + c(c,c)*x5 + c(d,d)*x6
 
  x1 ~~ c(e,e)*x1
  x2 ~~ c(f,f)*x2
  x3 ~~ c(g,g)*x3
  x4 ~~ c(h,h)*x4
  x5 ~~ c(i,i)*x5
  x6 ~~ c(j,j)*x6
  f1 ~~ c(k,k)*f1
  f2 ~~ c(l,l)*f2
  f1 ~~ c(m,m)*f2
  x1b =~ c(0,3.91)*x1
  x1b ~~ c(0,NA)*x1b
  x2b =~ c(0,3.91)*x2
  x2b ~~ c(0,NA)*x2b
  x3b =~ c(0,3.91)*x3
  x3b ~~ c(0,NA)*x3b
  x4b =~ c(0,3.91)*x4
  x4b ~~ c(0,NA)*x4b
  x5b =~ c(0,3.91)*x5
  x5b ~~ c(0,NA)*x5b
  x6b =~ c(0,3.91)*x6
  x6b ~~ c(0,NA)*x6b
 
  bf1 =~ c(0,1)*x1b  + c(0,NA)*x2b + c(0,NA)*x3b + c(0,NA)*x4b  +
     c(0,NA)*x5b + c(0,NA)*x6b
  bf1 ~~ c(0,NA)*bf1 + c(0,0)*f1 + c(0,0)*f2
'

The specified model looks like:

Run the model:

results6 <- cfa(level2.1factor, sample.cov = combined.cov,
 sample.nobs = combined.n, orthogonal = T)
summary(results6, fit.measures = T, standardized = T)

## lavaan 0.6-3 ended normally after 54 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         38
##   Number of equality constraints                    13
## 
##   Number of observations per group         
##   within                                          3640
##   between                                          254
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                     142.465
##   Degrees of freedom                                17
##   P-value (Chi-square)                           0.000
## 
## Chi-square for each group:
## 
##   within                                        56.924
##   between                                       85.541
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic            11054.075
##   Degrees of freedom                                30
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.989
##   Tucker-Lewis Index (TLI)                       0.980
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -27509.959
##   Loglikelihood unrestricted model (H1)     -27438.727
## 
##   Number of free parameters                         25
##   Akaike (AIC)                               55069.919
##   Bayesian (BIC)                             55226.599
##   Sample-size adjusted Bayesian (BIC)        55147.160
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.062
##   90 Percent Confidence Interval          0.052  0.071
##   P-value RMSEA <= 0.05                          0.019
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.024
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## 
## Group 1 [within]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   f1 =~                                                                 
##     x1                1.000                               0.694    0.882
##     x2         (a)    1.081    0.021   51.609    0.000    0.750    0.873
##     x4         (b)    0.649    0.024   27.156    0.000    0.450    0.454
##   f2 =~                                                                 
##     x3                1.000                               0.688    0.634
##     x5         (c)    1.146    0.030   38.404    0.000    0.789    0.819
##     x6         (d)    1.317    0.034   38.786    0.000    0.906    0.868
##   x1b =~                                                                
##     x1                0.000                               0.000    0.000
##   x2b =~                                                                
##     x2                0.000                               0.000    0.000
##   x3b =~                                                                
##     x3                0.000                               0.000    0.000
##   x4b =~                                                                
##     x4                0.000                               0.000    0.000
##   x5b =~                                                                
##     x5                0.000                               0.000    0.000
##   x6b =~                                                                
##     x6                0.000                               0.000    0.000
##   bf1 =~                                                                
##     x1b               0.000                                 NaN      NaN
##     x2b               0.000                                 NaN      NaN
##     x3b               0.000                                 NaN      NaN
##     x4b               0.000                                 NaN      NaN
##     x5b               0.000                                 NaN      NaN
##     x6b               0.000                                 NaN      NaN
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   f1 ~~                                                                 
##     f2         (m)    0.301    0.013   24.014    0.000    0.630    0.630
##     bf1               0.000                                 NaN      NaN
##   f2 ~~                                                                 
##     bf1               0.000                                 NaN      NaN
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .x1         (e)    0.137    0.008   17.000    0.000    0.137    0.222
##    .x2         (f)    0.176    0.010   18.328    0.000    0.176    0.238
##    .x3         (g)    0.704    0.019   37.860    0.000    0.704    0.598
##    .x4         (h)    0.781    0.019   41.160    0.000    0.781    0.794
##    .x5         (i)    0.306    0.012   25.779    0.000    0.306    0.330
##    .x6         (j)    0.270    0.014   19.493    0.000    0.270    0.247
##     f1         (k)    0.481    0.016   30.196    0.000    1.000    1.000
##     f2         (l)    0.473    0.024   20.059    0.000    1.000    1.000
##     x1b               0.000                                 NaN      NaN
##     x2b               0.000                                 NaN      NaN
##     x3b               0.000                                 NaN      NaN
##     x4b               0.000                                 NaN      NaN
##     x5b               0.000                                 NaN      NaN
##     x6b               0.000                                 NaN      NaN
##     bf1               0.000                                 NaN      NaN
## 
## 
## Group 2 [between]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   f1 =~                                                                 
##     x1                1.000                               0.694    0.407
##     x2         (a)    1.081    0.021   51.609    0.000    0.750    0.360
##     x4         (b)    0.649    0.024   27.156    0.000    0.450    0.270
##   f2 =~                                                                 
##     x3                1.000                               0.688    0.343
##     x5         (c)    1.146    0.030   38.404    0.000    0.789    0.345
##     x6         (d)    1.317    0.034   38.786    0.000    0.906    0.496
##   x1b =~                                                                
##     x1                3.910                               1.511    0.887
##   x2b =~                                                                
##     x2                3.910                               1.900    0.911
##   x3b =~                                                                
##     x3                3.910                               1.685    0.841
##   x4b =~                                                                
##     x4                3.910                               1.341    0.804
##   x5b =~                                                                
##     x5                3.910                               2.073    0.907
##   x6b =~                                                                
##     x6                3.910                               1.497    0.820
##   bf1 =~                                                                
##     x1b               1.000                               0.987    0.987
##     x2b               1.258    0.037   33.807    0.000    0.987    0.987
##     x3b               1.025    0.062   16.612    0.000    0.906    0.906
##     x4b               0.784    0.054   14.640    0.000    0.871    0.871
##     x5b               1.256    0.064   19.471    0.000    0.903    0.903
##     x6b               0.991    0.047   21.132    0.000    0.986    0.986
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   f1 ~~                                                                 
##     f2         (m)    0.301    0.013   24.014    0.000    0.630    0.630
##     bf1               0.000                               0.000    0.000
##   f2 ~~                                                                 
##     bf1               0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .x1         (e)    0.137    0.008   17.000    0.000    0.137    0.047
##    .x2         (f)    0.176    0.010   18.328    0.000    0.176    0.040
##    .x3         (g)    0.704    0.019   37.860    0.000    0.704    0.175
##    .x4         (h)    0.781    0.019   41.160    0.000    0.781    0.281
##    .x5         (i)    0.306    0.012   25.779    0.000    0.306    0.058
##    .x6         (j)    0.270    0.014   19.493    0.000    0.270    0.081
##     f1         (k)    0.481    0.016   30.196    0.000    1.000    1.000
##     f2         (l)    0.473    0.024   20.059    0.000    1.000    1.000
##     x1b               0.004    0.002    1.742    0.082    0.027    0.027
##     x2b               0.006    0.003    1.812    0.070    0.025    0.025
##     x3b               0.033    0.008    4.161    0.000    0.179    0.179
##     x4b               0.028    0.007    3.814    0.000    0.241    0.241
##     x5b               0.052    0.008    6.639    0.000    0.185    0.185
##     x6b               0.004    0.004    0.985    0.325    0.027    0.027
##     bf1               0.145    0.017    8.637    0.000    1.000    1.000

2. Automatic Setup of an MLCFA

To run this automatically using lavaan, the setup is now much simpler:

twolevel <- '
level: 1
  f1 =~ x1 + x2 + x4
  f2 =~ x3 + x5 + x6
level: 2
  fb =~ x1 + x2 + x3 + x4 + x5 + x6
'

results <- cfa(twolevel, data = raw, cluster = 'sid')
summary(results, fit.measures = T, standardized = T)

## lavaan 0.6-3 ended normally after 73 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         31
## 
##   Number of observations                          3894
##   Number of clusters [sid]                         254
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                     140.120
##   Degrees of freedom                                17
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic            10848.717
##   Degrees of freedom                                30
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.989
##   Tucker-Lewis Index (TLI)                       0.980
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -27492.948
##   Loglikelihood unrestricted model (H1)     -27422.888
## 
##   Number of free parameters                         31
##   Akaike (AIC)                               55047.896
##   Bayesian (BIC)                             55242.179
##   Sample-size adjusted Bayesian (BIC)        55143.675
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.043
##   90 Percent Confidence Interval          0.037  0.050
##   P-value RMSEA <= 0.05                          0.953
## 
## Standardized Root Mean Square Residual (corr metric):
## 
##   SRMR (within covariance matrix)                0.022
##   SRMR (between covariance matrix)               0.055
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Observed information based on                Hessian
##   Standard Errors                             Standard
## 
## 
## Level 1 [within]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   f1 =~                                                                 
##     x1                1.000                               0.693    0.881
##     x2                1.084    0.021   52.810    0.000    0.751    0.874
##     x4                0.644    0.024   27.003    0.000    0.446    0.449
##   f2 =~                                                                 
##     x3                1.000                               0.688    0.634
##     x5                1.146    0.030   38.464    0.000    0.789    0.819
##     x6                1.321    0.034   38.847    0.000    0.908    0.869
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   f1 ~~                                                                 
##     f2                0.301    0.012   24.171    0.000    0.631    0.631
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .x1                0.000                               0.000    0.000
##    .x2                0.000                               0.000    0.000
##    .x4                0.000                               0.000    0.000
##    .x3                0.000                               0.000    0.000
##    .x5                0.000                               0.000    0.000
##    .x6                0.000                               0.000    0.000
##     f1                0.000                               0.000    0.000
##     f2                0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .x1                0.138    0.008   17.530    0.000    0.138    0.223
##    .x2                0.175    0.009   18.669    0.000    0.175    0.237
##    .x4                0.790    0.019   41.008    0.000    0.790    0.799
##    .x3                0.705    0.019   38.015    0.000    0.705    0.598
##    .x5                0.306    0.012   25.811    0.000    0.306    0.329
##    .x6                0.269    0.014   19.432    0.000    0.269    0.246
##     f1                0.481    0.016   30.197    0.000    1.000    1.000
##     f2                0.473    0.024   19.969    0.000    1.000    1.000
## 
## 
## Level 2 [sid]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   fb =~                                                                 
##     x1                1.000                               0.385    0.988
##     x2                1.262    0.037   34.133    0.000    0.486    0.986
##     x3                0.997    0.067   14.861    0.000    0.384    0.899
##     x4                0.790    0.054   14.611    0.000    0.304    0.911
##     x5                1.209    0.071   17.078    0.000    0.466    0.898
##     x6                0.987    0.050   19.704    0.000    0.380    0.988
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .x1                4.744    0.028  168.611    0.000    4.744   12.167
##    .x2                4.564    0.035  131.910    0.000    4.564    9.253
##    .x3                3.513    0.033  107.212    0.000    3.513    8.225
##    .x4                4.348    0.027  160.857    0.000    4.348   13.024
##    .x5                4.198    0.037  114.219    0.000    4.198    8.095
##    .x6                3.913    0.030  130.439    0.000    3.913   10.168
##     fb                0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .x1                0.004    0.002    1.824    0.068    0.004    0.024
##    .x2                0.007    0.003    2.271    0.023    0.007    0.028
##    .x3                0.035    0.008    4.100    0.000    0.035    0.191
##    .x4                0.019    0.008    2.495    0.013    0.019    0.170
##    .x5                0.052    0.008    6.554    0.000    0.052    0.193
##    .x6                0.003    0.004    0.946    0.344    0.003    0.023
##     fb                0.148    0.018    8.274    0.000    1.000    1.000

The factor loadings and fit indices are similar– and are exactly the same when done using Mplus (see article p. 15 for a comparison).

To get the alpha at either level (part of the function), can use the alpha function (I prefer Omega nowadays though for several reasons).

alpha(x$ab.cov) #multilevel alpha or level 2 alpha

## [1] 0.9658581

NOTE: At the moment, you cannot use ordinal data in a two level model using lavaan and WLSMV.

END ###