In a forthcoming study (to be published in the peer-reviewed journal Educational Policy), University of Missouri economists Cory Koedel, Mark Ehlert, Eric Parsons, and Michael Podgursky present evidence that poverty should be included as a factor when evaluating teachers using VAMs. This study, also covered in a recent article in Education Week and in a news brief published by the University of Missouri, highlights how doing this “could lead to a more ‘effective and equitable’ teacher-evaluation system.”
Koedel, as cited in the University of Missouri brief, argues that using what he and his colleagues term a “proportional” system would “level the playing field” for teachers working with students from different income backgrounds. They evaluated three types of growth models/VAMs to support their conclusions; however, the details of how they did this will not be available until the article itself is published.
While Koedel and I tend to disagree on VAMs as potential tools for teacher evaluation (he believes that such statistical and methodological tweaks will bring us to a level of VAM perfection that I believe is likely impossible), the important takeaway from this piece is that VAMs are (more) biased when such background- and resource-sensitive factors are excluded from VAM-based calculations.
Accordingly, while Koedel writes elsewhere (in a related Policy Brief) that using what they term a “proportional” evaluation system would “entirely mitigate this concern,” we have evidence elsewhere that even with the most sophisticated and comprehensive controls available, VAM-based bias can never be “entirely” mitigated (see, for example, here and here). This is likely because, while Koedel, his colleagues, and others argue (often with solid evidence) that controlling for “observed student and school characteristics” helps to mitigate bias, there are still unobserved student and school characteristics that cannot be quantified and hence cannot be controlled for or factored out, and this will likely forever prevent bias’s “entire mitigation.”
Here, the question is whether we need sheer perfection. The answer is no, but when these models are applied in practice, particularly to teachers who teach homogeneous groups of students in relatively more extreme positions (e.g., disproportionate numbers of students from high-needs, non-English-proficient backgrounds), bias still matters. While as a whole we might be less concerned about bias when such factors are included in VAMs, there likely always will be cases where bias will impact individual teachers. This is where folks will have to rely on human judgment to interpret the “objective” numbers based on VAMs.
Controlling for additional variables (such as poverty) subdivides the data, effectively reducing sample sizes and therefore making it harder (or impossible) to identify real differences amidst the “noise.”
On the level of the individual teacher, the sample sizes are small even before you start controlling for all the things that might make that teacher different from the next.
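To make the sample-size point concrete, here is a rough simulation sketch. It is not the model used in the study. It assumes, purely for illustration, that every teacher is equally effective and that each student's score gain is random noise with a standard deviation of 1. The function name and parameters are made up for this example. Any spread in the class averages it reports is therefore noise, not a real difference between teachers.

```python
import random
import statistics


def spread_of_class_means(class_size, n_teachers=1000, seed=0):
    """Standard deviation of class-average gains when all teachers are identical.

    Hypothetical setup: the true teacher effect is 0 for everyone, and each
    student's gain is drawn from a normal distribution with mean 0 and SD 1.
    """
    rng = random.Random(seed)
    class_means = []
    for _ in range(n_teachers):
        # Pure noise: nothing about the teacher influences these gains.
        gains = [rng.gauss(0.0, 1.0) for _ in range(class_size)]
        class_means.append(statistics.fmean(gains))
    return statistics.pstdev(class_means)


if __name__ == "__main__":
    for size in (10, 25, 100, 400):
        print(size, round(spread_of_class_means(size), 3))
```

Under these assumptions the spread shrinks roughly in proportion to one over the square root of the class size, so a single classroom of 25 or so students leaves a lot of room for identical teachers to look different by chance.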
See also a more current version of this study here: http://www.caldercenter.org/sites/default/files/wp-80-updated-v3.pdf