Recent & Upcoming Talks

2024

Replication studies have been promoted as a means to investigate the fragility or robustness of findings from prior studies. However, less well known is that replication studies can be done with nonexperimental or secondary datasets and are not just for experimental studies. I present a framework of different types of replication studies with nonexperimental or secondary data. I show that replication studies can be used as robustness checks, as a means of testing the generalizability of existing theories, and as a way of extending findings of prior studies.

Data from international large-scale assessments (ILSAs) reflect the nested structure of education systems and are, therefore, very well suited for multilevel modeling (MLM). However, because these data come from complex cluster samples, there are methodological aspects that a researcher needs to understand when doing MLM, e.g., the need for using sampling weights and multiple achievement values for parameter estimation. This course will teach participants how to do MLM with data from ILSAs, such as PIRLS, TIMSS, and PISA. The content of the course will include an overview of the ILSAs and a presentation on the design of these studies and databases and implications for MLM analysis. Participants will learn how to specify two-level models using the HLM software program and also learn about model comparison, centering decisions and their consequences, and available resources for doing three-level models. Time will be allotted for participants to work on practice exercises, with several instructors available to mentor and answer questions. Participants should have a solid understanding of OLS regression and a basic understanding of MLM. Prior experience using a statistical software program, such as Stata or SPSS, is helpful. Prior knowledge about ILSAs or prior experience using the respective databases or HLM software is not required. Rathbun, A., Huang, F., Meinck, S., Park, B., Ikoma, S., & Zhang, Y. (2023, April). Multilevel modeling with large-scale international datasets. Professional development course presented at the annual meeting of the American Educational Research Association.

2023

In education, data are often clustered (e.g., students within schools), and various methods (e.g., multilevel modeling, generalized estimating equations) have been developed over the years to properly account for these nonindependent data structures. Ignoring the clustered data structure is well known to produce erroneous statistical inferences (e.g., inflated type I error rates) because standard errors are misestimated and the degrees of freedom used are overly liberal. One alternative method when analyzing clustered datasets is to use cluster-robust standard errors (CRSEs; CR0) (Liang & Zeger, 1986). CRSEs are often used in various disciplines (e.g., econometrics) but are not common in educational research. A limitation of CRSEs is that, although they work well with a large number of clusters, they are known to still underestimate standard errors when there are a limited number of clusters (e.g., < 50). This is of particular importance when analyzing data from cluster randomized controlled trials (CRTs), where a limited number of clusters is common. Over 20 years ago, Bell and McCaffrey (2002) proposed an adjustment to the traditional CRSEs, the bias-reduced linearization (or CR2) estimator, used together with Satterthwaite (1946) degrees of freedom (df) adjustments. However, the CR2 has not seen much use in the applied literature due to its limited accessibility. Using Monte Carlo simulations (in R), we evaluated the CR2 estimator under conditions often found in educational research, with both continuous and binary outcomes (as well as cross-classified data structures). Conditions based on the number of clusters, the intraclass correlation coefficient, and group size (among others) were manipulated. Coverage probabilities, type I error rates, and power were assessed. The CR2 estimator results (with and without df adjustments) were compared to results from the traditional CR0 CRSEs and multilevel models (MLMs).
Findings show that the traditional CRSEs (i.e., CR0) had problems when only a few clusters were present, but the CR2 results were comparable to those estimated using multilevel models, making the CR2 a viable alternative when only a few clusters are present. To extend its use for applied researchers, we also provide a free SPSS add-on that can compute these CRSEs.
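As a rough illustration of the traditional CR0 estimator discussed above, the sketch below computes a simple-regression slope with both a conventional OLS standard error and a CR0 cluster-robust standard error. It is not the CR2 estimator and applies no Satterthwaite df adjustment; the function name, variable names, and toy data are made up for illustration and are not taken from the talk.

```python
from collections import defaultdict
from math import sqrt


def slope_with_cr0(x, y, cluster):
    """Simple-regression slope with conventional and CR0 standard errors.

    Illustrative sketch only: CR0 (Liang & Zeger style) without any
    small-sample correction, so it is expected to be too small when
    there are few clusters.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Conventional OLS variance of the slope: residual variance / Sxx.
    var_ols = sum(e ** 2 for e in resid) / (n - 2) / sxx

    # CR0: sum the score (x - x_bar) * residual within each cluster,
    # square the cluster totals, and sandwich between 1 / Sxx terms.
    score = defaultdict(float)
    for xi, ei, g in zip(x, resid, cluster):
        score[g] += (xi - x_bar) * ei
    var_cr0 = sum(s ** 2 for s in score.values()) / sxx ** 2

    return slope, sqrt(var_ols), sqrt(var_cr0)


# Hypothetical toy data: four clusters of two observations each,
# with the predictor (e.g., treatment) assigned at the cluster level.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 1, 3, 3, 4, 4, 8, 8]
cluster = ["a", "a", "b", "b", "c", "c", "d", "d"]

slope, se_ols, se_cr0 = slope_with_cr0(x, y, cluster)
print(f"slope={slope:.3f}  SE(OLS)={se_ols:.3f}  SE(CR0)={se_cr0:.3f}")
```

In this toy example the outcomes are identical within each cluster, so the conventional standard error, which treats all eight observations as independent, is smaller than the CR0 standard error.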

Multilevel data can have observations nested within two dimensions of clustering that do not follow a purely nested structure. Failure to consider both dimensions simultaneously may lead to biased results. To address this, cross-classified random effects models (CCREMs) have been developed (Goldstein, 1987) to capture the random effects from multiple dimensions. Although effective, CCREMs may encounter nonconvergence issues. Alternatively, a linear regression model with cluster-robust standard errors (OLS-CRSEs) (Liang & Zeger, 1986) can provide asymptotically consistent standard errors. Cameron et al. (2011) introduced the two-way clustering case, which is equivalent to cross-classified data. However, a comprehensive comparison of these approaches for multilevel data with two dimensions of clustering and a small number of clusters is lacking.

Data from international large-scale assessments (ILSAs) reflect the nested structure of education systems and are, therefore, very well suited for multilevel modeling (MLM). However, because these data come from complex cluster samples, there are methodological aspects that a researcher needs to understand when doing MLM, e.g., the need for using sampling weights and multiple achievement values for parameter estimation. This course will teach participants how to do MLM with data from ILSAs, such as PIRLS, TIMSS, and PISA. The content of the course will include an overview of the ILSAs and a presentation on the design of these studies and databases and implications for MLM analysis. Participants will learn how to specify two-level models using the HLM software program and also learn about model comparison, centering decisions and their consequences, and available resources for doing three-level models. Time will be allotted for participants to work on practice exercises, with several instructors available to mentor and answer questions. Participants should have a solid understanding of OLS regression and a basic understanding of MLM. Prior experience using a statistical software program, such as Stata or SPSS, is helpful. Prior knowledge about ILSAs or prior experience using the respective databases or HLM software is not required. Rathbun, A., Huang, F., Meinck, S., Park, B., Ikoma, S., & Zhang, Y. (2023, April). Multilevel modeling with large-scale international datasets. Professional development course presented at the annual meeting of the American Educational Research Association.

2021

Binary outcomes are often encountered when analyzing cluster randomized trials (CRTs). A common approach to obtaining the average treatment effect of an intervention may involve using a logistic regression model. We outline some interpretive and statistical challenges associated with using logistic regression and discuss two alternative/supplementary approaches for analyzing clustered data with binary outcomes: the linear probability model (LPM) and the modified Poisson regression model. In our simulation and applied example, all models use a standard error adjustment that is effective even if a low number of clusters is present. Simulation results show that both the LPM and modified Poisson regression models can provide unbiased point estimates with acceptable coverage and type I error rates even with as few as 20 clusters. Society for Prevention Research (SPR)

Data from international large-scale assessments (ILSAs) reflect the nested structure of education systems and are, therefore, very well suited for multilevel modeling (MLM). However, because these data come from complex cluster samples, there are methodological aspects that a researcher needs to understand when doing MLM, e.g., the need for using sampling weights and multiple achievement values for parameter estimation. This course will teach participants how to do MLM with data from ILSAs, such as PIRLS, TIMSS, and PISA. The content of the course will include an overview of the ILSAs and a presentation on the design of these studies and databases and implications for MLM analysis. Participants will learn how to specify two-level models using the HLM software program and also learn about model comparison, centering decisions and their consequences, and available resources for doing three-level models. Time will be allotted for participants to work on practice exercises, with several instructors available to mentor and answer questions. Participants should have a solid understanding of OLS regression and a basic understanding of MLM. Prior experience using a statistical software program, such as Stata or SPSS, is helpful. Prior knowledge about ILSAs or prior experience using the respective databases or HLM software is not required. Rathbun, A., Huang, F., Meinck, S., Park, B., Ikoma, S., & Zhang, Y. (2021, April). Multilevel modeling with large-scale international datasets. Professional development course presented at the annual meeting of the American Educational Research Association.

2020

Preconference Workshop: Data from international large-scale assessments (ILSAs) reflect the nested structure of education systems and are, therefore, very well suited for multilevel modeling (MLM). However, because these data come from complex cluster samples, there are methodological aspects that a researcher needs to understand when doing MLM, e.g., the need for using sampling weights and multiple achievement values for parameter estimation. This course will teach participants how to do MLM with data from ILSAs, such as PIRLS, TIMSS, and PISA. The content of the course will include an overview of the ILSAs and a presentation on the design of these studies and databases and implications for MLM analysis. Participants will learn how to specify two-level models using the HLM software program and also learn about model comparison, centering decisions and their consequences, and available resources for doing three-level models. Time will be allotted for participants to work on practice exercises, with several instructors available to mentor and answer questions. Participants should have a solid understanding of OLS regression and a basic understanding of MLM. Prior experience using a statistical software program, such as Stata or SPSS, is helpful. Prior knowledge about ILSAs or prior experience using the respective databases or HLM software is not required. To fully participate in the hands-on demonstrations and example analyses, participants should bring their own laptops with HLM software (a free student version is available), which works in Windows and Parallels Desktop on Macs. Rathbun, A., D., Huang, F., Meinck, S., Park, B., Ikoma, S., & Zhang, Y. (2020, April). Multilevel modeling with large-scale international datasets. Professional development course presented at the annual meeting of the American Educational Research Association, San Francisco, CA.

Experiments in psychology or education often use logistic regression models (LRMs) when analyzing binary outcomes. However, a challenge with LRMs is that results are generally difficult to interpret. We present alternatives to LRMs in the analysis of experiments and discuss the linear probability model, the log-binomial model, and the modified Poisson regression model. A Monte Carlo simulation assessed bias in point estimates and standard errors as well as power and type I error rates of the different methods. Findings show that the linear probability and the modified Poisson regression models are valid, unbiased, and in some cases, better alternatives to the LRM when the predictor of interest is a binary variable. An applied example is provided as well.
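To make the interpretive difference concrete, the sketch below computes the three effect measures that these models target when the only predictor is a binary treatment indicator: the risk difference (the slope of a linear probability model), the risk ratio (the exponentiated coefficient of a log-binomial or modified Poisson model), and the odds ratio (the exponentiated coefficient of a logistic regression). The counts and function name are hypothetical and are not taken from the talk, and no standard errors are computed.

```python
def effect_measures(events_treated, n_treated, events_control, n_control):
    """Risk difference, risk ratio, and odds ratio from a 2x2 table.

    With a single binary predictor these equal the quantities estimated by
    the linear probability model, the log-binomial / modified Poisson model,
    and the logistic regression model, respectively.
    """
    p1 = events_treated / n_treated   # outcome proportion, treated group
    p0 = events_control / n_control   # outcome proportion, control group
    risk_difference = p1 - p0
    risk_ratio = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return risk_difference, risk_ratio, odds_ratio


# Hypothetical experiment: 30 of 100 treated and 20 of 100 control
# participants experience the outcome.
rd, rr, odds = effect_measures(30, 100, 20, 100)
print(f"risk difference={rd:.3f}  risk ratio={rr:.3f}  odds ratio={odds:.3f}")
```

The risk difference and risk ratio are stated directly in terms of probabilities, whereas the odds ratio is not, which is one reason logistic regression results can be harder to communicate.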

2019

Among econometricians, instrumental variable (IV) estimation is a commonly used technique to estimate the causal effect of a particular variable on a specified outcome. However, among applied researchers in the social sciences, IV estimation may not be well understood. Although there are several IV estimation primers from different fields, most manuscripts are not readily accessible to researchers who may only be familiar with regression-based techniques. This presentation provides a conceptual framework of why and how IV works in the context of evaluating treatment effects using randomized evaluations. I discuss the issue of imperfect treatment compliance, explain the logic of IV estimation, and provide a sample dataset and syntax for conducting IV analysis using R.
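The logic of IV estimation under imperfect compliance can be sketched with the Wald (ratio) estimator, which uses random assignment as the instrument: the intent-to-treat effect on the outcome is divided by the effect of assignment on treatment receipt. The talk's own materials use R; the Python function, variable names, and toy data below are illustrative assumptions only.

```python
from statistics import mean


def wald_iv_estimate(assigned, treated, outcome):
    """Wald IV estimate with random assignment as the instrument.

    assigned: 1 if randomized to treatment, else 0 (the instrument)
    treated:  1 if treatment was actually received, else 0
    outcome:  observed outcome for each participant
    """
    y1 = mean(y for z, y in zip(assigned, outcome) if z == 1)
    y0 = mean(y for z, y in zip(assigned, outcome) if z == 0)
    d1 = mean(d for z, d in zip(assigned, treated) if z == 1)
    d0 = mean(d for z, d in zip(assigned, treated) if z == 0)

    itt_effect = y1 - y0        # effect of being assigned to treatment
    compliance_gap = d1 - d0    # effect of assignment on treatment receipt
    if compliance_gap == 0:
        raise ValueError("assignment does not change treatment receipt")
    return itt_effect / compliance_gap


# Hypothetical data: half of those assigned to treatment actually take it,
# and nobody in the control group does.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [14, 14, 10, 10, 10, 10, 10, 10]

print(wald_iv_estimate(assigned, treated, outcome))
```

Here the intent-to-treat effect is diluted by noncompliance, and dividing by the compliance gap rescales it to the effect among those who took the treatment because they were assigned to it.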

Week-long R workshop: Aug 12-16, 2019.

Data from international large-scale assessments (ILSAs) reflect the nested structure of education systems and are, therefore, very well suited for multilevel modeling (MLM). However, because these data come from complex cluster samples, there are methodological aspects that a researcher needs to understand when doing MLM, e.g., the need for using sampling weights and multiple achievement values for parameter estimation. This course will teach participants how to do MLM with data from ILSAs, such as PIRLS, TIMSS, and PISA. The content of the course will include an overview of the ILSAs and a presentation on the design of these studies and databases and implications for MLM analysis. Participants will learn how to specify two-level models using the HLM software program and also learn about model comparison, centering decisions and their consequences, and available resources for doing three-level models. Time will be allotted for participants to work on practice exercises, with several instructors available to mentor and answer questions. Participants should have a solid understanding of OLS regression and a basic understanding of MLM. Prior experience using a statistical software program, such as Stata or SPSS, is helpful. Prior knowledge about ILSAs or prior experience using the respective databases or HLM software is not required. To fully participate in the hands-on demonstrations and example analyses, participants should bring their own laptops with HLM software (a free student version is available), which works in Windows and Parallels Desktop on Macs. Miller, D., Huang, F., Meinck, S., Park, B., Ikoma, S., & Zhang, Y. (2019, April). Multilevel modeling with large-scale international datasets. Professional development course presented at the annual meeting of the American Educational Research Association, Toronto, Canada.

2018

Short introduction to regression discontinuity designs (with examples and R code). Second of two parts.

Short introduction to regression discontinuity designs (with examples and R code). First of two parts.

Data from international large-scale assessments (ILSAs) reflect the nested structure of education systems and are, therefore, very well suited for multilevel modeling (MLM). However, because these data come from complex cluster samples, there are methodological aspects that a researcher needs to understand when doing MLM, e.g., the need for using sampling weights and multiple achievement values for parameter estimation. This course will teach participants how to do MLM with data from ILSAs, such as PIRLS, TIMSS, and PISA. The content of the course will include an overview of the ILSAs and a presentation on the design of these studies and databases and implications for MLM analysis. Participants will learn how to specify two-level models using the HLM software program and also learn about model comparison, centering decisions and their consequences, and available resources for doing three-level models. Time will be allotted for participants to work on practice exercises, with several instructors available to mentor and answer questions. Participants should have a solid understanding of OLS regression and a basic understanding of MLM. Prior experience using a statistical software program, such as Stata or SPSS, is helpful. Prior knowledge about ILSAs or prior experience using the respective databases or HLM software is not required.