The other day in class, while we were discussing cases where the standard errors of a regression model need to be adjusted (e.g., when analyzing clustered data or when residuals are heteroskedastic), a student asked: how do we know what the ‘true’ standard error should be in the first place? We need to know that before we can say whether an estimated standard error is too high or too low.
This short simulation illustrates that, over repeated sampling from a specified population, the standard deviation of the estimated regression coefficients can be used as the true standard error.
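To make the idea concrete before turning to real data, here is a minimal sketch in base R. It is not the simulation used in this post: the population model, the coefficients, the sample size, and the number of replications are all assumptions chosen for illustration, and the errors are deliberately homoskedastic so that the usual model-based standard error should be about right.

```r
# Minimal sketch: the SD of the slope estimates across repeated samples
# approximates the true standard error of the slope.
# All numeric settings below are illustrative assumptions.
set.seed(123)   # arbitrary seed, for reproducibility only
n_reps <- 2000  # number of repeated samples (assumed)
n      <- 200   # observations per sample (assumed)
beta0  <- 1     # assumed population intercept
beta1  <- 0.5   # assumed population slope

slopes    <- numeric(n_reps)  # slope estimate from each sample
model_ses <- numeric(n_reps)  # model-based SE reported by lm() for each sample

for (i in seq_len(n_reps)) {
  x <- rnorm(n)
  y <- beta0 + beta1 * x + rnorm(n)  # homoskedastic errors with sd = 1
  fit <- lm(y ~ x)
  slopes[i]    <- coef(fit)["x"]
  model_ses[i] <- summary(fit)$coefficients["x", "Std. Error"]
}

empirical_se     <- sd(slopes)      # "true" SE: spread of the estimates
average_model_se <- mean(model_ses) # what lm() reports, on average

round(c(empirical = empirical_se, model_based = average_model_se), 3)
```

Because the assumptions of ordinary least squares hold in this setup, the two numbers should be close. When they are violated (clustering, heteroskedasticity), the same comparison shows how far the reported standard errors drift from the empirical one.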
This illustration compares different flavors of robust standard errors. First, load the libraries and the dataset, then recode. Dummy coding is not strictly necessary, but it may make constructing the X matrix easier. The example uses the High School & Beyond (hsb) dataset.
library(mlmRev) #has the hsb dataset
## Loading required package: lme4
## Loading required package: Matrix
library(summarytools) #for descriptives
library(jtools) #for output
library(dplyr) #for pipes and selecting
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
##     filter, lag
## The following objects are masked from 'package:base':
##
##     intersect, setdiff, setequal, union
library(sandwich) #robust SEs
library(lmtest) #for coeftest
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.