Which weights to use with multilevel models?
A common question with the use of large-scale assessments (LSAs) is related to the use of weights. Another issue is how to specify these weights properly.
Software such as SAS and Mplus, when specifying weights at two levels, require the use of conditional weights at level 1 if the level-2 weight is specified (or you can just use the level-2 weights alone; see Mang et al., 2021, see bottom part of this post).
To illustrate this using R (with the mixPV
function):
1. Load in the needed functions and data
First, load in the mixPV
function. Load in the pisa2012
dataset as well.
library(modelsummary)
source("https://raw.githubusercontent.com/flh3/pubdata/main/mixPV/mixPVv2.R")
data(pisa2012, package = 'MLMusingR')
By default, the WeMix
package uses conditional weights = FALSE or cWeights = FALSE
(no need to specify). This makes it easier so users don’t have to compute the conditional weights (conditional on the school being selected) and can just use the total weight supplied in many datasets. You have to make sure which type of weight you are using. NOTE: the mc = TRUE
option is used to shorten computing time (turning on the use of multiple cores). The following just uses the raw weights at both levels:
m1a <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
sc14q02 + st04q01 + escs + (1|schoolid),
weights = c('w_fstuwt', 'w_fschwt'), mc = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
However, if you use software such as Mplus or SAS, you need to specify the conditional weights. To compute the conditional weights (if you are specifying weights at both levels), it is the total weight divided by the school weight.
pisa2012$cweight <- pisa2012$w_fstuwt / pisa2012$w_fschwt
We can then use this when we fit the model and we include cWeights = TRUE
. We can compare the output in a while.
m1b <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
sc14q02 + st04q01 + escs + (1|schoolid),
weights = c('cweight', 'w_fschwt'), mc = TRUE,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
I will include the example if we just use the conditional weight (as you would with other software packages) but DO NOT turn on the cWeights
option.
m1c <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
sc14q02 + st04q01 + escs + (1|schoolid),
weights = c('cweight', 'w_fschwt'), mc = TRUE,
#cWeight = TRUE, ## for illustrative purposes only
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
We can compare results side-by-side.
modelsummary(list("default" = m1a, "cweights" = m1b, "wrong" = m1c),
stars = TRUE, gof_map = NA)
default | cweights | wrong | |
---|---|---|---|
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 | |||
(Intercept) | 470.498*** | 470.498*** | 479.535*** |
(7.103) | (7.103) | (4.876) | |
sc14q02Very little | -20.910 | -20.910 | -8.561 |
(15.710) | (15.710) | (6.421) | |
sc14q02To some extent | -16.131 | -16.131 | -27.209** |
(11.823) | (11.823) | (8.611) | |
sc14q02A lot | -25.224** | -25.224** | -23.765*** |
(7.614) | (7.614) | (6.635) | |
st04q01Male | 9.529** | 9.529** | 10.784** |
(3.097) | (3.097) | (4.163) | |
escs | 26.300*** | 26.300*** | 35.847*** |
(2.210) | (2.210) | (2.158) | |
schoolid.(Intercept) | 1344.166*** | 1344.166*** | 0.000 |
(300.241) | (300.241) | (0.000) | |
Residual | 5418.299*** | 5418.299*** | 6701.321*** |
(160.102) | (160.102) | (210.501) |
The first two models have identical results. The third model (which is incorrectly specified) results differ and the level-2 variance here is zero.
2. What about scaling level-1 weights?
Asparouhov (2005) suggested scaling the weights at level 1. The following formulas (and explanations) can be seen at the Mplus website at https://statmodel.com/download/Scaling3.pdf.
The unscaled weight is \(w^*_{ij}=w_{ij}\).
What Mplus calls the cluster weight has scaled weights that add up to the cluster sample size. Carle (2009) refers to this as Method A. It is computed using:
\(w^*_{ij}=w_{ij}\frac{n_j}{\Sigma_i w_{ij}}\)
Here, \(n_j\) is the number of sample units in cluster \(j\). This is the default scaling method in Mplus and has generally been shown to perform well (see Asparouhov, 2006).
Another scaling approach is referred to as the ecluster method which is the effective cluster sample size method. This is referred to as Method B by Carle (2009) (though the formula in his appendix seems to differ slightly):
This is similar to the cluster method except that the numerator is \(n^*_j\) instead and is:
\(w^*_{ij}=w_{ij}\frac{n^*_j}{\Sigma_i w_{ij}}\).
The numerator is computed using:
\(n^*_{j}=\frac{(\Sigma_i w_{ij})^2}{\Sigma_i w^2_{ij}}\).
To compute the scaled weights, we can use the wscale
function (note the options). The cluster variable, dataset, conditional cluster weight,
and type of scaling need to be specified:
pisa2012$cl_weight <- wscale(cluster = 'schoolid', data = pisa2012,
wt = 'cweight', type = 'cluster')
pisa2012$es_weight <- wscale(cluster = 'schoolid', data = pisa2012,
wt = 'cweight', type = 'ecluster')
After I wrote the function above, I found a function that does the scaling. The results are the same:
xx <- datawizard::rescale_weights(pisa2012, "schoolid", "w_fstuwt")
We can compare results to those of Mplus. This is using cluster weights:
ex <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
1 + (1|schoolid),
weights = c('cl_weight', 'w_fschwt'), mc = TRUE,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
summary(ex)
Results of multilevel analyses with 5 plausible values.
Number of observations: 3136
Estimates for random effects:
estimate std.error statistic df Pr(>t)
schoolid.(Intercept) 1825.03 436.26 4.18 2719.0 <2e-16 ***
Residual 5726.58 233.29 24.55 90.6 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimates for fixed effects:
estimate std.error statistic df Pr(>t)
(Intercept) 473.39 7.03 67.33 1777 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Comparing to Mplus output shows similar results (SEs might differ slightly):
Estimate S.E. Est./S.E. P-Value
Within Level
Variances
MATH 5726.580 234.000 24.473 0.000
Between Level
Means
MATH 473.390 7.387 64.088 0.000
Variances
MATH 1825.021 464.365 3.930 0.000
This is using the ecluster option:
ex2 <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
1 + (1|schoolid),
weights = c('es_weight', 'w_fschwt'), mc = TRUE,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
summary(ex2)
Results of multilevel analyses with 5 plausible values.
Number of observations: 3136
Estimates for random effects:
estimate std.error statistic df Pr(>t)
schoolid.(Intercept) 1821.40 434.62 4.19 2714.0 <2e-16 ***
Residual 5727.56 233.36 24.54 90.2 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimates for fixed effects:
estimate std.error statistic df Pr(>t)
(Intercept) 473.40 7.02 67.39 1776 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimate S.E. Est./S.E. P-Value
Within Level
Variances
MATH 5727.560 234.056 24.471 0.000
Between Level
Means
MATH 473.403 7.379 64.156 0.000
Variances
MATH 1821.389 462.665 3.937 0.000
This is unscaled:
ex3 <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
1 + (1|schoolid),
weights = c('cweight', 'w_fschwt'), mc = TRUE,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
summary(ex3)
Results of multilevel analyses with 5 plausible values.
Number of observations: 3136
Estimates for random effects:
estimate std.error statistic df Pr(>t)
schoolid.(Intercept) 2126.22 503.39 4.22 3099 <2e-16 ***
Residual 5895.85 175.05 33.68 876 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimates for fixed effects:
estimate std.error statistic df Pr(>t)
(Intercept) 472.35 7.33 64.44 1470 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimate S.E. Est./S.E. P-Value
Within Level
Variances
MATH 5895.849 174.758 33.737 0.000
Between Level
Means
MATH 472.349 7.750 60.950 0.000
Variances
MATH 2126.477 535.991 3.967 0.000
Results differ based on the scaling option and results are all similar to those generated by Mplus.
3. What about just using level-2 weights?
A simpler option may just be to use level-2 weights (as suggested by Mang et al., 2021 as well as our own simulation results [under review]). To do so, when using R, we need to need to set the weight at level 1 to 1 and use conditional weights.
pisa2012$one <- 1 #level-1 weight
m1d <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
sc14q02 + st04q01 + escs + (1|schoolid),
weights = c('one', 'w_fschwt'), mc = T,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
For comparison, I will show results also using the scaled weights:
m1e <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
sc14q02 + st04q01 + escs + (1|schoolid),
weights = c('cl_weight', 'w_fschwt'), mc = TRUE,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
m1f<- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~
sc14q02 + st04q01 + escs + (1|schoolid),
weights = c('es_weight', 'w_fschwt'), mc = TRUE,
cWeight = TRUE,
data = pisa2012)
Attempting to use 15 cores. Progress will not be displayed. Please wait.
modelsummary(list("l2only" = m1d, 'cluster' = m1e, "ecluster" = m1f),
stars = TRUE, gof_map = NA)
l2only | cluster | ecluster | |
---|---|---|---|
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 | |||
(Intercept) | 469.276*** | 469.241*** | 469.233*** |
(7.049) | (7.088) | (7.082) | |
sc14q02Very little | -18.928 | -18.795 | -18.800 |
(13.537) | (13.624) | (13.614) | |
sc14q02To some extent | -14.990 | -15.337 | -15.339 |
(11.902) | (12.117) | (12.111) | |
sc14q02A lot | -23.536** | -23.265** | -23.273** |
(7.952) | (8.088) | (8.083) | |
st04q01Male | 12.296** | 12.051** | 12.084** |
(4.320) | (4.345) | (4.349) | |
escs | 29.086*** | 29.045*** | 29.049*** |
(2.698) | (2.685) | (2.688) | |
schoolid.(Intercept) | 1054.672*** | 1069.977*** | 1067.115*** |
(258.981) | (260.180) | (259.201) | |
Residual | 5223.750*** | 5219.380*** | 5220.383*** |
(206.297) | (204.390) | (204.518) |
Results are very similar.
A reason for this is that the level-1 scaled weights are all very close to 1 (or even equal to 1). If there is no variation within cluster, the weights will be 1.
If we examine the original conditional weights, there is quite some variability:
range(pisa2012$cweight) #the original conditional weight
[1] 1.228571 37.018006
However, compared to the scaled weights, they are all very close to 1. Which is why the results shown above are all very similar.
range(pisa2012$es_weight)
[1] 0.7635826 1.3325292
range(pisa2012$cl_weight)
[1] 0.7820465 1.3559951
mean(pisa2012$es_weight)
[1] 0.9962017
mean(pisa2012$cl_weight)
[1] 1
Takeaway: when running multilevel models with weights at different levels, make sure you are specifying the correct weights. If the residual variance at the second level is close to zero, you should probably revisit your syntax.
References
Asparouhov, T. (2006). General Multi-Level Modeling with Sampling Weights Communications in Statistics - Theory and Methods, 35: 439-460.
Carle, A. (2009). Fitting multilevel models in complex survey data with design weights: Recommendations BMC Medical Research Methodology, 9(49): 1-13.
Mang, J., Küchenhoff, H., Meinck, S., & Prenzel, M. (2021). Sampling weights in multilevel modelling: An investigation using PISA sampling structures. Large-Scale Assessments in Education, 9(1), 6. https://doi.org/10.1186/s40536-021-00099-0
END