This appendix presents all tables of estimates produced by the simulations reported in Chapter 4 and described in more detail in Appendix C. For these simulations, we generated 1,000 data sets, each with its own pattern of missing data. By letting missing data occur at random (within defined probabilities) many times, and then averaging statistical results across the 1,000 data sets, we ensure the robustness of the simulation findings and of the conclusions drawn from those findings concerning the performance of the different missing data methods examined. Multiple replications also give us distributions for the impact estimates and their standard errors that reflect the sampling variability built into the data (and present in real data).
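The replication logic described above can be illustrated with a minimal sketch. It is not the simulation code used for this report: the sample size, the alternating treatment assignment, the normally distributed outcome, the completely-at-random deletion of outcomes, and the complete-case difference in means are all simplifying assumptions made for illustration, and the sketch ignores the clustering of students within schools. Only the 1,000 replications, the 40 percent missing data rate, and the true impact of 0.200 come from the text.

```python
import random
import statistics


def simulate_impact_estimates(n_reps=1000, n=200, true_impact=0.2,
                              missing_rate=0.4, seed=12345):
    """Return one complete-case impact estimate per simulated data set.

    All defaults except n_reps, true_impact and missing_rate are
    hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        treated, control = [], []
        for i in range(n):
            is_treated = i % 2 == 0  # alternating assignment (assumption)
            outcome = rng.gauss(0.0, 1.0) + (true_impact if is_treated else 0.0)
            # Each outcome goes missing with a fixed probability, so every
            # replication gets its own pattern of missing data.
            if rng.random() < missing_rate:
                continue
            (treated if is_treated else control).append(outcome)
        # Complete-case estimate: difference in observed group means.
        estimates.append(statistics.fmean(treated) - statistics.fmean(control))
    return estimates


estimates = simulate_impact_estimates()
mean_estimate = statistics.fmean(estimates)  # average across replications
sd_estimate = statistics.stdev(estimates)    # spread across replications
```

Averaging over the replications smooths out the effect of any single random missing data pattern, while the spread of the 1,000 estimates gives the distribution of the impact estimate referred to above.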
As described in Appendix C, different scenarios are used in the simulations, defined by (a) the nature of the missing data mechanism; (b) the missing data rate (5 percent or 40 percent); and (c) whether data are missing for students within schools or for entire schools. Therefore, the appendix contains 12 tables:
Each table consists of two panels:
The goal of these simulations was to estimate the bias in the impact estimates and standard errors that results from using different approaches to addressing missing data. Because bias is defined as the difference between the expected value of the estimator and the true parameter value, we estimated the bias in the two key estimates in the following way:
Note that each table begins by displaying the estimates from simulations in which none of the data were missing. These estimates do not match the true parameter values exactly because of random error. For example, the impact estimate with no missing data equals 0.203, which differs from the true impact of 0.200. When none of the data are missing, the impact estimates and standard error estimates are unbiased, so the non-zero bias estimates in this case are entirely due to sampling error.
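As a minimal sketch of the definition of bias given above, the expected value of an estimator can be approximated by the average of its estimates across replications, and the true parameter value subtracted from that average. The function name and the example values below are hypothetical and are not taken from the tables.

```python
import statistics


def estimated_bias(estimates, true_value):
    """Average of the replicated estimates minus the true parameter value."""
    return statistics.fmean(estimates) - true_value


# Hypothetical replicated estimates, for illustration only
# (not values from this report's simulations).
example_estimates = [0.19, 0.22, 0.20, 0.21]
bias = estimated_bias(example_estimates, true_value=0.20)
```

Applied to the no-missing-data case described above, an impact estimate of 0.203 against a true impact of 0.200 gives an estimated bias of 0.003, which reflects sampling error rather than any bias in the estimator.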