In our module on regression diagnostics, I mentioned (1) that with clustered data, standard errors may be misestimated and are often too low, resulting in a greater chance of making a Type I error (i.e., finding statistically significant results when there should be none). In our ANCOVA session, I also indicated (2) that covariates are helpful because they lower the (standard) error in the model and increase power.
Researchers may want to simulate a two-level model (also known as a hierarchical linear model, a random effects model, etc.). The following code illustrates how to generate the data and compares two analytic techniques: multilevel modeling (MLM) and ordinary least squares (OLS) regression.
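Before reading the code, it may help to see the data-generating model written out. This is a sketch inferred from the simulation code in this post, not a formula stated in the original. The symbols $\gamma_{01}$, $\gamma_{10}$, $u_{0j}$, and $e_{ij}$ are my own notation for `W1`, `X1`, `err2`, and `err1`; $W_j$ and $X_{ij}$ stand for the predictors `l2` and `l1`. The intercept is zero because the code adds none.

```latex
% Two-level random-intercept model implied by the simulation code
% i = 1, ..., 30 units (nJ) nested in j = 1, ..., 20 groups (nG)
\begin{align*}
y_{ij} &= \gamma_{01} W_j + \gamma_{10} X_{ij} + u_{0j} + e_{ij},
  & \gamma_{01} &= 2, \quad \gamma_{10} = 3, \\
W_j &\sim N(0, 1), \quad X_{ij} \sim N(0, 1),
  & u_{0j} &\sim N(0, \tau_{00} = 1), \quad e_{ij} \sim N(0, \sigma^2 = 1).
\end{align*}

% Conditional intraclass correlation (after accounting for W and X)
\[
\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2} = \frac{1}{1 + 1} = 0.5
\]
```

Under these assumptions, half of the residual variance lies between groups. That is a substantial amount of clustering, which is why ignoring the grouping in an OLS model can be expected to distort the standard errors.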
1. Simulate the data

set.seed(1234) # for reproducibility
nG <- 20 # number of groups
nJ <- 30 # cluster size
W1 <- 2 # level 2 coeff
X1 <- 3 # level 1 coeff

tmp2 <- rnorm(nG) # generate 20 random numbers, m = 0, sd = 1
l2 <- rep(tmp2, each = nJ) # all units in the same cluster share the same value
group <- gl(nG, k = nJ) # creating cluster variable
tmp2 <- rnorm(nG) # error term for level 2
err2 <- rep(tmp2, each = nJ) # all units in the same cluster share the same error
l1 <- rnorm(nG * nJ) # total sample size is nG * nJ
err1 <- rnorm(nG * nJ) # error term for level 1

# putting it all together
y <- W1 * l2 + X1 * l1 + err2 + err1
dat <- data.