Assessment of growth: A comparison of models for projecting growth

Date of Completion

January 2011


Education, Tests and Measurements|Psychology, Psychometrics




The purpose of this dissertation is to compare student growth models with respect to their ability to (a) accurately project students' scores at a future point in time and (b) correctly classify students' proficiency levels at a future point in time. Eleven growth models derived from four different methods are compared: (1) multilevel purely nested growth models (linear and quadratic); (2) two-level covariate-adjusted regression models (2-level CARM); (3) Student Growth Percentile (SGP); and (4) two-level cross-classified growth models (linear and quadratic). The comparison includes models that do and do not require vertical scaling.

The results revealed that, overall, in both reading and mathematics assessments the cross-classified growth models (both linear and quadratic) had the smallest Root Mean Square Error (RMSE) and the highest R-square values, suggesting that the cross-classified models fit the data best. In terms of accuracy of projected reading scores, the linear cross-classified model (CCLGM1) was the least biased model, while the purely nested quadratic model that ignores student mobility (PNQGM) was the most biased of all nine models. According to RMSE values, the following five models provided the most accurate projections (smaller projection error): (a) 2-level CARM, (b) SGP, (c) CCLGM1, (d) CCLGM2, and (e) PNLGM. On the other hand, CCQGM1 and PNQGM were the two growth models with the least projection accuracy in the reading assessment.

In projecting mathematics scores, SGP was the least biased model, whereas CCLGM2 was the most biased. According to RMSE values, the two regression-based models (SGP and 2-level CARM) yielded more accurate projections than the other seven models, whereas CCQGM1 and PNQGM yielded the least accurate projections of mathematics scores.
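The two accuracy criteria used above can be made concrete with a small sketch. Bias is the mean signed error of the projections (positive values indicate systematic over-projection) and RMSE is the root of the mean squared error. The function and the score values below are illustrative assumptions, not data or code from the dissertation.

```python
import math

def projection_accuracy(projected, observed):
    """Return (bias, RMSE) for paired projected vs. observed scores.

    bias: mean signed error (projected minus observed).
    RMSE: square root of the mean squared error.
    """
    errors = [p - o for p, o in zip(projected, observed)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse

# Hypothetical illustration: projections from two toy "models"
# compared against the same observed scale scores.
observed = [220, 235, 250, 265, 280]
model_a = [222, 233, 252, 263, 281]  # small, unsystematic errors
model_b = [230, 245, 258, 275, 288]  # consistent over-projection

bias_a, rmse_a = projection_accuracy(model_a, observed)
bias_b, rmse_b = projection_accuracy(model_b, observed)
```

In this toy case model_a has near-zero bias and a small RMSE, while model_b shows a large positive bias and correspondingly larger RMSE, mirroring the distinction the abstract draws between the least- and most-biased models.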
With respect to the classification of students into proficiency levels, all methods classified most accurately at the below-basic level and least accurately at the goal level, in both reading and mathematics. Policy implications of these findings are discussed.