Extracting the V matrix for MLMs

Notes to self (and anyone else who might find this useful). With the general linear mixed models (to simplify, I am just omitting super/subscripts):

\[Y = X\beta + Zu + e\] where we assume \(u \sim MVN(0, G)\) and \(e \sim MVN(0, R)\). \(V\) is:

\[V = ZGZ^T + R\]

Software estimates \(V\) iteratively and maximizes the likelihood (the function of which depends on whether ML or REML is used). The value of \(\beta\) can be obtained using:

\[\hat{\beta} = (X^TV^{-1}X)^{-1}X^TV^{-1}y\].

I just wanted to figure out how to extract \(V\) from the R output.

1. Manual method

Just using the sleepstudy dataset:

library(lme4)
data(sleepstudy)
nrow(sleepstudy) #total number of observations
[1] 180
dplyr::n_distinct(sleepstudy$Subject) #number of clusters
[1] 18
m1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy

REML criterion at convergence: 1743.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9536 -0.4634  0.0231  0.4634  5.1793 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.10   24.741       
          Days         35.07    5.922   0.07
 Residual             654.94   25.592       
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)  251.405      6.825  36.838
Days          10.467      1.546   6.771

Correlation of Fixed Effects:
     (Intr)
Days -0.138

To obtain the \(V\) matrix:

Z <- getME(m1, 'Z') #180 x 36
G <- bdiag(VarCorr(m1)) #2 x 2
G #the vcov matrix for RE
## 2 x 2 sparse Matrix of class "dgCMatrix"
##             (Intercept)      Days
## (Intercept)  612.100158  9.604409
## Days           9.604409 35.071714
R <- diag(sigma(m1)^2, nobs(m1)) #180 x 180

\(G\) is a 2 \(\times\) 2 matrix. Need to expand this based on the number of cluster/subjects.

Gm <- kronecker(diag(18), G)
Gm[1:6, 1:6] #just to show this
## 6 x 6 sparse Matrix of class "dgTMatrix"
##                                                                    
## [1,] 612.100158  9.604409   .         .          .         .       
## [2,]   9.604409 35.071714   .         .          .         .       
## [3,]   .         .        612.100158  9.604409   .         .       
## [4,]   .         .          9.604409 35.071714   .         .       
## [5,]   .         .          .         .        612.100158  9.604409
## [6,]   .         .          .         .          9.604409 35.071714
V <- Z %*% Gm %*% t(Z) + R

Knowing \(V\), we can put this together to get the fixed effects:

X <- model.matrix(m1)
y <- m1@resp$y

I will just make a function to put this together:

getFE <- function(X, V, y){
  solve(t(X) %*% solve(V) %*% X) %*% t(X) %*% solve(V) %*% y
}
getFE(X, V, y) 
## 2 x 1 Matrix of class "dgeMatrix"
##           [,1]
## [1,] 251.40510
## [2,]  10.46729
fixef(m1) #the same
## (Intercept)        Days 
##   251.40510    10.46729

Doing this way is fine for simple, well arranged (sorted) datasets but can get more complicated with more levels of clustering (three or more levels). I had to create the Gm matrix.

2. More automated way:

After some Googling, I found this which shows how to make a more general function to construct that \(V\) matrix. I just put this into a function which will make it easier to call and should work with more complicated RE specifications:

getV <- function(x){
  var.d <- crossprod(getME(x, "Lambdat"))
  Zt <- getME(x, "Zt")
  vr <- sigma(x)^2
  var.b <- vr * (t(Zt) %*% var.d %*% Zt)
  sI <- vr * Matrix::Diagonal(nobs(x)) #for a sparse matrix
  var.y <- var.b + sI
}

Just using the earlier example:

getFE(X, getV(m1), y) #works
## 2 x 1 Matrix of class "dgeMatrix"
##           [,1]
## [1,] 251.40510
## [2,]  10.46729

3. Try it with simple three level data…

Just make a small dataset that is manageable (always helpful to be able to see the data). Five students with 3 teachers in 10 sch (or 15 students per school)

n1 <- 5; n2 = 3; n3 = 10
sch <- letters[1:10] #schoolid
tch <- 1:30 #teacherid
sch.cl <- rep(sch, each = n1 * n2) #cluster var
tch.cl <- rep(tch, each = n1)
set.seed(123)
e3 <- rep(rnorm(n1), each = n1 * n2)
e2 <- rep(rnorm(n2 * n1), each = n1)
x1 <- rnorm(n1 * n2 * n3)
e1 <- rnorm(n1 * n2 * n3)
y <- e3 + e2 + .5 * x1 + e1
dat <- data.frame(y, sch.cl, tch.cl, e3, e2, x1, e1)
dat[1:30, ]
             y sch.cl tch.cl         e3         e2          x1          e1
1   0.40529698      a      1 -0.5604756  1.7150650 -1.06782371 -0.21538051
2   1.11089492      a      1 -0.5604756  1.7150650 -0.21797491  0.06529303
3   0.60751986      a      1 -0.5604756  1.7150650 -1.02600445 -0.03406725
4   2.91859562      a      1 -0.5604756  1.7150650 -0.72889123  2.12845190
5   0.10073361      a      1 -0.5604756  1.7150650 -0.62503927 -0.74133610
6  -2.03890236      a      2 -0.5604756  0.4609162 -1.68669331 -1.09599627
7   0.35712248      a      2 -0.5604756  0.4609162  0.83778704  0.03778840
8   0.28760787      a      2 -0.5604756  0.4609162  0.15337312  0.31048075
9  -0.23210443      a      2 -0.5604756  0.4609162 -1.13813694  0.43652348
10  0.06898269      a      2 -0.5604756  0.4609162  1.25381492 -0.45836533
11 -2.67563090      a      3 -0.5604756 -1.2650612  0.42646422 -1.06332613
12 -0.70988745      a      3 -0.5604756 -1.2650612 -0.29507148  1.26318518
13 -1.72762444      a      3 -0.5604756 -1.2650612  0.89512566 -0.34965039
14 -2.25198300      a      3 -0.5604756 -1.2650612  0.87813349 -0.86551286
15 -1.65102591      a      3 -0.5604756 -1.2650612  0.82158108 -0.23627957
16 -0.76988611      b      4 -0.2301775 -0.6868529  0.68864025 -0.19717589
17  0.46984878      b      4 -0.2301775 -0.6868529  0.55391765  1.10992029
18 -0.86324890      b      4 -0.2301775 -0.6868529 -0.06191171  0.08473729
19 -0.31595789      b      4 -0.2301775 -0.6868529 -0.30596266  0.75405379
20 -1.60655786      b      4 -0.2301775 -0.6868529 -0.38047100 -0.49929202
21 -0.80874764      b      5 -0.2301775 -0.4456620 -0.69470698  0.21444531
22 -1.10448401      b      5 -0.2301775 -0.4456620 -0.20791728 -0.32468591
23 -1.21395411      b      5 -0.2301775 -0.4456620 -1.26539635  0.09458353
24 -0.48672483      b      5 -0.2301775 -0.4456620  2.16895597 -0.89536336
25 -1.38265999      b      5 -0.2301775 -0.4456620  1.20796200 -1.31080153
26  2.42956340      b      6 -0.2301775  1.2240818 -1.12310858  1.99721338
27  1.39317071      b      6 -0.2301775  1.2240818 -0.40288484  0.60070882
28 -0.49069473      b      6 -0.2301775  1.2240818 -0.46665535 -1.25127136
29  0.77272095      b      6 -0.2301775  1.2240818  0.77996512 -0.61116592
30 -0.23326031      b      6 -0.2301775  1.2240818 -0.08336907 -1.18548008
m2 <- lmer(y ~ x1 + (1|sch.cl/tch.cl), data = dat)
summary(m2)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x1 + (1 | sch.cl/tch.cl)
   Data: dat

REML criterion at convergence: 490.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.54947 -0.56040 -0.07822  0.52962  2.09596 

Random effects:
 Groups        Name        Variance Std.Dev.
 tch.cl:sch.cl (Intercept) 1.2265   1.1075  
 sch.cl        (Intercept) 0.4201   0.6482  
 Residual                  0.9812   0.9905  
Number of obs: 150, groups:  tch.cl:sch.cl, 30; sch.cl, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)  0.37589    0.29907   1.257
x1           0.35875    0.09218   3.892

Correlation of Fixed Effects:
   (Intr)
x1 0.005 
Z <- getME(m2, "Z") #n x (sch + tch; just intercepts)
dim(Z)
[1] 150  40
image(Z[1:30,]) #just showing the Z matrix for the first 30 obs

The \(Z\) design matrix is different now since there are more than two levels. This needs to be multiplied by the \(Gm\) matrix to follow:

G <- bdiag(VarCorr(m2))
G #how should G then be formed for the Gm?
2 x 2 sparse Matrix of class "dgCMatrix"
                       
[1,] 1.226473 .        
[2,] .        0.4201496

Intercept variance on upper left is for teachers and lower right is for schools. Constructing the matrix is relatively straightforward. Doing this manually.

a1 <- kronecker(diag(30), 1.226473) #for teachers
a2 <- kronecker(diag(10), .4201496) #for schools
Gm <- bdiag(a1, a2) #stacking as a diagonal matrix
dim(Gm)
## [1] 40 40
image(Gm)

Can then use this to create the \(V\) matrix:

V <- Z %*% Gm %*% t(Z) + diag(sigma(m2)^2, nobs(m2))
X <- model.matrix(m2)
y <- m2@resp$y
getFE(X, V, y) #compute our fixed effects
## 2 x 1 Matrix of class "dgeMatrix"
##           [,1]
## [1,] 0.3758945
## [2,] 0.3587507
fixef(m2) #the same
## (Intercept)          x1 
##   0.3758945   0.3587507

If we use our getV function, this just works too which much less hassle!

getFE(X, getV(m2), y)
## 2 x 1 Matrix of class "dgeMatrix"
##           [,1]
## [1,] 0.3758945
## [2,] 0.3587507

4. Throw in random slopes

The above will only work if the structure of \(Z\) and \(G\) are simple. If data are not sorted, that may not work.

library(mlmRev)
library(dplyr)
data(star)
set.seed(123)
konly <- filter(star, gr == 'K') %>% sample_frac(.2) #just selecting a few
#so the Z matrix can be seen
m3 <- lmer(read ~ cltype + sx + (sx|sch/tch), data = konly)
X <- model.matrix(m3)
y <- m3@resp$y
Vm <- getV(m3)
getFE(X, Vm, y)
4 x 1 Matrix of class "dgeMatrix"
           [,1]
[1,] 438.908738
[2,]  -5.923652
[3,]  -2.209975
[4,]   2.045771
fixef(m3) #the same!
(Intercept)   cltypereg cltypereg+A         sxF 
 438.908738   -5.923652   -2.209975    2.045771 

Look at the \(G\) matrix:

bdiag(VarCorr(m3))
## 4 x 4 sparse Matrix of class "dgCMatrix"
##                                                 
## [1,] 141.78835 -10.1180105   .         .        
## [2,] -10.11801   0.7220208   .         .        
## [3,]   .         .         153.929734 -6.3991477
## [4,]   .         .          -6.399148  0.2660246

The Z matrix is a mess!

Z <- getME(m3, 'Z')
image(Z)

You can compute the \(V\) matrix per cluster but in this example, just wanted to get the overall matrix.

–END

Next
Previous
comments powered by Disqus