Individual-Level VAM Scores Over Time: “Less Reliable than Flipping a Coin”

In case you missed it (I did), an article authored by Stuart Yeh (Associate Professor at the University of Minnesota) titled “A Reanalysis of the Effects of Teacher Replacement Using Value-Added Modeling” was (recently) published in the esteemed, peer-reviewed journal Teachers College Record. While the citation lists a 2013 publication date, my understanding is that the actual article was released more recently.

Regardless, its contents are important to share, particularly in terms of VAM-based levels of reliability, which Yeh frames as follows: “The question of stability [reliability/consistency] is not a question about whether average teacher performance rises, declines, or remains flat over time. The issue that concerns critics of VAM is whether individual teacher performance fluctuates over time in a way that invalidates inferences that an individual teacher is “low-” or “high-” performing. This distinction is crucial because VAM is increasingly being applied such that individual teachers who are identified as low-performing are to be terminated. From the perspective of individual teachers, it is inappropriate and invalid to fire a teacher whose performance is low this year but high the next year, and it is inappropriate to retain a teacher whose performance is high this year but low next year. Even if average teacher performance remains stable over time, individual teacher performance may fluctuate wildly from year to year” (p. 7).

Yeh’s conclusion, then (based on the evidence presented in this piece), is that “VAM is less reliable than flipping a coin for the purpose of categorizing high- and low-performing teachers” (p. 19). More specifically, VAMs have an estimated overall error rate of 59% (see Endnote 2, page 26 for further explanation).
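To get a feel for how unstable year-to-year scores translate into misclassification, here is a minimal simulation sketch. It is not Yeh's method and does not reproduce his 59% figure; the function name, the bottom-20% cutoff, the normally distributed scores, and the correlation values are all illustrative assumptions. It simply asks: if teachers' scores in two consecutive years are only weakly correlated, what share of the teachers flagged as "low-performing" in year one would no longer be flagged in year two?

```python
import random


def reclassification_rate(correlation, n_teachers=20000, cutoff=0.2, seed=0):
    """Share of teachers flagged in year 1 who are NOT flagged in year 2.

    All inputs are hypothetical: scores are simulated as standard normal
    draws with the given year-to-year correlation, and "flagged" means
    falling in the bottom `cutoff` fraction of that year's scores.
    """
    rng = random.Random(seed)
    noise_scale = (1 - correlation ** 2) ** 0.5
    year1, year2 = [], []
    for _ in range(n_teachers):
        score1 = rng.gauss(0, 1)
        # Year-2 score shares only part of its variation with year 1.
        score2 = correlation * score1 + noise_scale * rng.gauss(0, 1)
        year1.append(score1)
        year2.append(score2)

    k = int(n_teachers * cutoff)
    threshold1 = sorted(year1)[k - 1]  # bottom-k boundary in year 1
    threshold2 = sorted(year2)[k - 1]  # bottom-k boundary in year 2

    flagged = [i for i in range(n_teachers) if year1[i] <= threshold1]
    changed = sum(1 for i in flagged if year2[i] > threshold2)
    return changed / len(flagged)


if __name__ == "__main__":
    for r in (0.2, 0.4, 0.6, 0.9):  # assumed correlations, for illustration
        print(f"correlation {r:.1f}: {reclassification_rate(r):.0%} "
              "of flagged teachers change category the next year")
```

Under these assumptions, the weaker the year-to-year correlation, the larger the share of flagged teachers who would have escaped the label had a different year's data been used, which is the kind of instability the quoted passage describes.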

That being said, the assumption that teacher quality is a fixed characteristic (i.e., that a high-performing teacher this year will be a high-performing teacher next year, and a low-performing teacher this year will be a low-performing teacher next year) is false and not supported by the available data, even though many VAM proponents continue to assume it (including Chetty et al.; see, for example, here, here, and here). Beyond that, prior estimates that using VAMs to identify teachers is no different than the flip of a coin may actually understate the problem given current reliability estimates (see also Table 2, p. 19; see also pp. 25-26).

In section two of this article, for those of you following the Chetty et al. debates, Yeh also critiques other assumptions underlying the Chetty et al. studies (see other, similar critiques here, here, here, and here). For example, Yeh critiques the VAM-based proposals to raise student achievement by (essentially) terminating low-value-added teachers. Here, the assumption is that “the use of VAM to identify and replace the lowest-performing 5% of teachers with average teachers would increase student achievement and would translate into sizable gains in the lifetime earnings of their students” (p. 2). However, because this also assumes that “there is an adequate supply of unemployed teachers who are ready and willing to be hired and would perform at a level that is 2.04 standard deviations above the performance of teachers who are fired based on value-added rankings [and] Chetty et al. do not justify this assumption with empirical data” (p. 14), this too proves much more idealistic than realistic.

In section three of this article, for those of you generally interested in better and, in this case, more cost-effective solutions, Yeh discusses a number of cost-effectiveness analyses comparing 22 leading approaches for raising student achievement, the results of which suggest that “the most efficient approach—rapid performance feedback (RPF)—is approximately 5,700 times as efficient as the use of VAM to identify and replace low-performing teachers” (p. 25; see also pp. 23-24).


Citation: Yeh, S. S. (2013). A re-analysis of the effects of teacher replacement using value-added modeling. Teachers College Record, 115(12), 1-35. Retrieved from

Recommended Reading — 50 Myths and Lies that Threaten America’s Public Schools: The Real Crisis in Education

A few months ago, two of the most renowned scholars in the field of education, Drs. David Berliner and Gene Glass, who are both mentors of mine here at Arizona State, wrote a book that is sure to stir up some engaging dialogue regarding public education. The two teamed up with a group of our PhD students as well as PhD students from the University of Colorado-Boulder in order to tackle what they have deemed the 50 Myths and Lies that Threaten America’s Public Schools: The Real Crisis in Education. While the book covers a wide range of topics, including everything from charter schools, to bullying, to English acquisition programs, and even sex education, there are several chapters (or myths) that would likely be of particular interest to the readers of this blog (i.e., on teacher accountability and VAMs; see forthcoming post).

Beyond those, eight of the other myths deal directly with the lies often told about teachers, including the way in which they should be evaluated based on student test scores (via VAMs). They include the following myths, each deconstructed in this book:

  • Teachers are the most important influence in a child’s education.
  • Teachers in the United States are well-paid.
  • Merit pay is a good way to increase the performance of teachers. Teachers should be evaluated on the basis of the performance of their students. Rewarding and punishing schools for the performance of their students will improve our nation’s schools. (See forthcoming post about this myth specifically.)
  • Teachers in schools that serve the poor are not very talented.
  • Teach for America teachers are well trained, highly qualified, and get amazing results.
  • Subject matter knowledge is the most important asset a teacher can possess.
  • Teachers’ unions are responsible for much poor school performance. Incompetent teachers cannot be fired if they have tenure.
  • Judging teacher education programs by means of the scores that their teachers’ students get on state tests is a good way to judge the quality of the teacher education program.

At a time when corporate reformists are more interested in making money off of education than actually attempting to improve educational quality for all students, books like this are critical in helping us distinguish the truth from the perpetual myths that have become the bedrock of the reform movement. Berliner and Glass do not hold back: they name names where possible and provide enough research to support their claims, but not so much as to slow them down. Instead, they call upon decades’ worth of educational research, the trusty work of their students, and a bit of logic and humor in order to pack a powerful punch against those most responsible for spreading the myths and, often, flat-out lies about America’s students, teachers, and schools. This book is for teachers, parents, policymakers, school administrators, and concerned citizens alike. I definitely recommend that you add it to your reading list!