Which weights to use with multilevel models?

A common question with the use of large-scale assessments (LSAs) is related to the use of weights. Another issue is how to specify these weights properly.

Software such as SAS and Mplus, when specifying weights at two levels, require the use of conditional weights at level 1 if the level-2 weight is specified (or you can just use the level-2 weights alone; see Mang et al., 2021, see bottom part of this post).

To illustrate this using R (with the mixPV function):

1. Load in the needed functions and data

First, load in the mixPV function. Load in the pisa2012 dataset as well.

library(modelsummary)
source("https://raw.githubusercontent.com/flh3/pubdata/main/mixPV/mixPVv2.R")
data(pisa2012, package = 'MLMusingR')

By default, the WeMix package uses conditional weights = FALSE or cWeights = FALSE (no need to specify). This makes it easier so users don’t have to compute the conditional weights (conditional on the school being selected) and can just use the total weight supplied in many datasets. You have to make sure which type of weight you are using. NOTE: the mc = TRUE option is used to shorten computing time (turning on the use of multiple cores). The following just uses the raw weights at both levels:

m1a <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               sc14q02 + st04q01 + escs + (1|schoolid), 
             weights = c('w_fstuwt', 'w_fschwt'), mc = TRUE,
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

However, if you use software such as Mplus or SAS, you need to specify the conditional weights. To compute the conditional weights (if you are specifying weights at both levels), it is the total weight divided by the school weight.

pisa2012$cweight <- pisa2012$w_fstuwt / pisa2012$w_fschwt

We can then use this when we fit the model and we include cWeights = TRUE. We can compare the output in a while.

m1b <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               sc14q02 + st04q01 + escs + (1|schoolid), 
             weights = c('cweight', 'w_fschwt'), mc = TRUE,
             cWeight = TRUE,
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

I will include the example if we just use the conditional weight (as you would with other software packages) but DO NOT turn on the cWeights option.

m1c <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               sc14q02 + st04q01 + escs + (1|schoolid), 
             weights = c('cweight', 'w_fschwt'), mc = TRUE,
             #cWeight = TRUE,  ## for illustrative purposes only
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

We can compare results side-by-side.

modelsummary(list("default" = m1a, "cweights" = m1b, "wrong" = m1c),
  stars = TRUE, gof_map = NA, output = 'gt')

	default	cweights	wrong
(Intercept)	470.498***	470.498***	479.535***
	(7.103)	(7.103)	(4.876)
sc14q02Very little	-20.910	-20.910	-8.561
	(15.710)	(15.710)	(6.421)
sc14q02To some extent	-16.131	-16.131	-27.209**
	(11.823)	(11.823)	(8.611)
sc14q02A lot	-25.224**	-25.224**	-23.765***
	(7.614)	(7.614)	(6.635)
st04q01Male	9.529**	9.529**	10.784**
	(3.097)	(3.097)	(4.163)
escs	26.300***	26.300***	35.847***
	(2.210)	(2.210)	(2.158)
schoolid.(Intercept)	1344.166***	1344.166***	0.000
	(300.241)	(300.241)	(0.000)
Residual	5418.299***	5418.299***	6701.321***
	(160.102)	(160.102)	(210.501)
+ p < 0.1, * p < 0.05, p < 0.01, * p < 0.001

The first two models have identical results. The third model (which is incorrectly specified) results differ and the level-2 variance here is zero.

2. What about scaling level-1 weights?

Asparouhov (2005) suggested scaling the weights at level 1. The following formulas (and explanations) can be seen at the Mplus website at https://statmodel.com/download/Scaling3.pdf.

The unscaled weight is $w^*_{ij}=w_{ij}$ .

What Mplus calls the cluster weight has scaled weights that add up to the cluster sample size. Carle (2009) refers to this as Method A. It is computed using:

$w^*_{ij}=w_{ij}\frac{n_j}{\Sigma_i w_{ij}}$

Here, $n_j$ is the number of sample units in cluster $j$ . This is the default scaling method in Mplus and has generally been shown to perform well (see Asparouhov, 2006).

Another scaling approach is referred to as the ecluster method which is the effective cluster sample size method. This is referred to as Method B by Carle (2009) (though the formula in his appendix seems to differ slightly):

This is similar to the cluster method except that the numerator is $n^*_j$ instead and is:

$w^*_{ij}=w_{ij}\frac{n^*_j}{\Sigma_i w_{ij}}$ .

The numerator is computed using:

$n^*_{j}=\frac{(\Sigma_i w_{ij})^2}{\Sigma_i w^2_{ij}}$ .

To compute the scaled weights, we can use the wscale function (note the options). The cluster variable, dataset, conditional cluster weight, and type of scaling need to be specified:

pisa2012$cl_weight <- wscale(cluster = 'schoolid', data = pisa2012, 
                             wt = 'cweight', type = 'cluster')
pisa2012$es_weight <- wscale(cluster = 'schoolid', data = pisa2012, 
                             wt = 'cweight', type = 'ecluster')

After I wrote the function above, I found a function that does the scaling. The results are the same:

xx <- datawizard::rescale_weights(pisa2012, "schoolid", "w_fstuwt")

We can compare results to those of Mplus. This is using cluster weights:

ex <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
              1 + (1|schoolid), 
            weights = c('cl_weight', 'w_fschwt'), mc = TRUE,
            cWeight = TRUE,
            data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

summary(ex)

Results of multilevel analyses with 5 plausible values.
Number of observations: 3136 

Estimates for random effects: 
                     estimate std.error statistic     df Pr(>t)    
schoolid.(Intercept)  1825.03    436.26      4.18 2719.0 <2e-16 ***
Residual              5726.58    233.29     24.55   90.6 <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimates for fixed effects: 
            estimate std.error statistic   df Pr(>t)    
(Intercept)   473.39      7.03     67.33 1777 <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comparing to Mplus output shows similar results (SEs might differ slightly):

                Estimate       S.E.  Est./S.E.    P-Value

Within Level

Variances
MATH            5726.580    234.000     24.473      0.000

Between Level

Means
MATH             473.390      7.387     64.088      0.000

Variances
MATH            1825.021    464.365      3.930      0.000

This is using the ecluster option:

ex2 <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               1 + (1|schoolid), 
             weights = c('es_weight', 'w_fschwt'), mc = TRUE,
             cWeight = TRUE,
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

summary(ex2)

Results of multilevel analyses with 5 plausible values.
Number of observations: 3136 

Estimates for random effects: 
                     estimate std.error statistic     df Pr(>t)    
schoolid.(Intercept)  1821.40    434.62      4.19 2714.0 <2e-16 ***
Residual              5727.56    233.36     24.54   90.2 <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimates for fixed effects: 
            estimate std.error statistic   df Pr(>t)    
(Intercept)   473.40      7.02     67.39 1776 <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                Estimate       S.E.  Est./S.E.    P-Value

Within Level

Variances
MATH            5727.560    234.056     24.471      0.000

Between Level

Means
MATH             473.403      7.379     64.156      0.000

Variances
MATH            1821.389    462.665      3.937      0.000

This is unscaled:

ex3 <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               1 + (1|schoolid), 
             weights = c('cweight', 'w_fschwt'), mc = TRUE,
             cWeight = TRUE,
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

summary(ex3)

Results of multilevel analyses with 5 plausible values.
Number of observations: 3136 

Estimates for random effects: 
                     estimate std.error statistic   df Pr(>t)    
schoolid.(Intercept)  2126.22    503.39      4.22 3099 <2e-16 ***
Residual              5895.85    175.05     33.68  876 <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimates for fixed effects: 
            estimate std.error statistic   df Pr(>t)    
(Intercept)   472.35      7.33     64.44 1470 <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                 Estimate       S.E.  Est./S.E.    P-Value
Within Level

Variances
MATH            5895.849    174.758     33.737      0.000

Between Level

Means
MATH             472.349      7.750     60.950      0.000

Variances
MATH            2126.477    535.991      3.967      0.000

Results differ based on the scaling option and results are all similar to those generated by Mplus.

3. What about just using level-2 weights?

A simpler option may just be to use level-2 weights (as suggested by Mang et al., 2021 as well as our own simulation results [under review]). To do so, when using R, we need to need to set the weight at level 1 to 1 and use conditional weights.

pisa2012$one <- 1 #level-1 weight
m1d <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               sc14q02 + st04q01 + escs + (1|schoolid), 
             weights = c('one', 'w_fschwt'), mc = T,
             cWeight = TRUE,
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

For comparison, I will show results also using the scaled weights:

m1e <- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
               sc14q02 + st04q01 + escs + (1|schoolid), 
             weights = c('cl_weight', 'w_fschwt'), mc = TRUE,
             cWeight = TRUE,
             data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

m1f<- mixPV(pv1math + pv2math + pv3math + pv4math + pv5math ~ 
              sc14q02 + st04q01 + escs + (1|schoolid), 
            weights = c('es_weight', 'w_fschwt'), mc = TRUE,
            cWeight = TRUE,
            data = pisa2012)

Attempting to use 15 cores. Progress will not be displayed. Please wait.

modelsummary(list("l2only" = m1d, 'cluster' = m1e, "ecluster" = m1f), 
             stars = TRUE, gof_map = NA, output = 'gt')

	l2only	cluster	ecluster
(Intercept)	469.276***	469.241***	469.233***
	(7.049)	(7.088)	(7.082)
sc14q02Very little	-18.928	-18.795	-18.800
	(13.537)	(13.624)	(13.614)
sc14q02To some extent	-14.990	-15.337	-15.339
	(11.902)	(12.117)	(12.111)
sc14q02A lot	-23.536**	-23.265**	-23.273**
	(7.952)	(8.088)	(8.083)
st04q01Male	12.296**	12.051**	12.084**
	(4.320)	(4.345)	(4.349)
escs	29.086***	29.045***	29.049***
	(2.698)	(2.685)	(2.688)
schoolid.(Intercept)	1054.672***	1069.977***	1067.115***
	(258.981)	(260.180)	(259.201)
Residual	5223.750***	5219.380***	5220.383***
	(206.297)	(204.390)	(204.518)
+ p < 0.1, * p < 0.05, p < 0.01, * p < 0.001

Results are very similar.

A reason for this is that the level-1 scaled weights are all very close to 1 (or even equal to 1). If there is no variation within cluster, the weights will be 1.

If we examine the original conditional weights, there is quite some variability:

range(pisa2012$cweight) #the original conditional weight

[1]  1.23 37.02

However, compared to the scaled weights, they are all very close to 1. Which is why the results shown above are all very similar.

range(pisa2012$es_weight)

[1] 0.764 1.333

range(pisa2012$cl_weight)

[1] 0.782 1.356

mean(pisa2012$es_weight)

[1] 0.996

mean(pisa2012$cl_weight)

[1] 1

Takeaway: when running multilevel models with weights at different levels, make sure you are specifying the correct weights. If the residual variance at the second level is close to zero, you should probably revisit your syntax.

References

Asparouhov, T. (2006). General Multi-Level Modeling with Sampling Weights Communications in Statistics - Theory and Methods, 35: 439-460.

Carle, A. (2009). Fitting multilevel models in complex survey data with design weights: Recommendations BMC Medical Research Methodology, 9(49): 1-13.

Mang, J., Kuchenhoff, H., Meinck, S., & Prenzel, M. (2021). Sampling weights in multilevel modelling: An investigation using PISA sampling structures. Large-Scale Assessments in Education, 9(1), 6. https://doi.org/10.1186/s40536-021-00099-0

END

Selecting the proper weights in LSAs with multilevel models

Which weights to use with multilevel models?

1. Load in the needed functions and data

2. What about scaling level-1 weights?

3. What about just using level-2 weights?

References