I just came across this 3-minute video that you all might (and should) find of interest (click here for a direct link to the video on YouTube; click here to view the video’s original posting on Stanford’s Center for Opportunity Policy in Education (SCOPE)).
Featured is Stanford Professor Emeritus Dr. Edward Haertel, describing what he sees as two major flaws in the use of value-added models (VAMs) for teacher evaluation and accountability. These two flaws are serious enough, he argues, that VAM scores should not be used to make high-stakes decisions about really any of America’s public school teachers. “Like all measurements, these scores are imperfect. They are appropriate and useful for some purposes, but not for others. Viewed from a measurement perspective, value-added scores have limitations that make them unsuitable for high-stakes personnel decisions.”
The first problem is the unreliability of VAM scores, which is attributed to noise in the data. The effect of a teacher is important, but it is weak relative to all of the other factors that contribute to student performance, so separating the teacher's effect from those other effects is very difficult. This isn’t a flaw that can be fixed by more sophisticated statistical models; it is innate to the data collected.
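To make the noise problem concrete, here is a minimal simulation sketch in Python. It is not Dr. Haertel's analysis and not a real VAM: the function names and every number in it (a teacher effect with a standard deviation of 0.1, student-level noise with a standard deviation of 1.0, classes of 25) are assumptions chosen only for illustration. Each teacher keeps the same true effect in both years, yet the class-average scores from one year to the next correlate only weakly.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_year_to_year(n_teachers=2000, class_size=25,
                          teacher_sd=0.1, student_sd=1.0, seed=0):
    """Correlate two years of class-average gains for the same teachers.

    All parameter values are illustrative assumptions, not estimates
    from real data.
    """
    rng = random.Random(seed)
    # Each teacher's true effect is fixed across both years.
    true_effects = [rng.gauss(0, teacher_sd) for _ in range(n_teachers)]

    def one_year():
        scores = []
        for effect in true_effects:
            # A student's gain = teacher effect + everything else (noise).
            gains = [effect + rng.gauss(0, student_sd)
                     for _ in range(class_size)]
            scores.append(sum(gains) / class_size)
        return scores

    return pearson(one_year(), one_year())


def expected_reliability(class_size=25, teacher_sd=0.1, student_sd=1.0):
    """Share of the variance in class averages that is true teacher signal."""
    signal = teacher_sd ** 2
    noise = student_sd ** 2 / class_size
    return signal / (signal + noise)


if __name__ == "__main__":
    print("simulated year-to-year correlation:", simulate_year_to_year())
    print("expected reliability:", expected_reliability())
```

Under these assumed numbers the expected reliability is 0.01 / (0.01 + 0.04) = 0.2, so most of what separates one teacher's score from another's in a given year is noise. In this toy setup a fancier model would not help, because the noise is in the data themselves.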
The second problem is bias, which the models must account for but do not. The bias comes from the difference in circumstances faced by a teacher in a strong school and a teacher in a high-needs school. The instructional history of a student includes out-of-school support, peer support, and the academic learning climate of the school, and VAMs do not take these important factors into account.
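The difference between bias and noise can also be sketched in a few lines of Python. Again, this is a deliberately simplified stand-in, not an actual VAM: the function names, the "context shift" of plus or minus 0.2, and the common true effect of 0.1 are all hypothetical values chosen for illustration. Two teachers with identical true effects are placed in different school contexts, and a score that ignores context rates them very differently no matter how many students are averaged.

```python
import random


def naive_score(true_effect, context_shift, n_students, rng, student_sd=1.0):
    """Average student gain when school context is left out of the model.

    context_shift stands in for out-of-school support, peer support, and
    school climate; it is a hypothetical quantity for illustration only.
    """
    gains = [true_effect + context_shift + rng.gauss(0, student_sd)
             for _ in range(n_students)]
    return sum(gains) / n_students


def compare_contexts(n_students=100_000, seed=0):
    """Score two equally effective teachers in two different contexts."""
    rng = random.Random(seed)
    same_effect = 0.1  # assumed identical true effect for both teachers
    strong_school = naive_score(same_effect, +0.2, n_students, rng)
    high_needs_school = naive_score(same_effect, -0.2, n_students, rng)
    return strong_school, high_needs_school


if __name__ == "__main__":
    strong, high_needs = compare_contexts()
    print("teacher in strong school:    ", round(strong, 2))
    print("teacher in high-needs school:", round(high_needs, 2))
```

Unlike the noise in the first problem, this gap does not shrink as the number of students grows. It stays near the assumed 0.4 difference in context, which the score wrongly attributes to the teachers.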
Thanks for sharing this – and so many other excellent resources. I would just add that in addition to the bias across dissimilar schools, I’d think there’s a bias across dissimilar groups of students within schools. I’ve heard elementary teachers talk about the differences among classes in a given year, and from year-to-year (and I don’t know of any school that practices random class assignments either). From my experience in secondary schools, I can tell you that there are significant differences among class sections. While students may be heterogeneously grouped in my English classes, they are tracked in other subjects, causing significant variations among groups that, on paper, might seem quite similar for VAM purposes. Similarly, because of scheduling quirks, I might share most of my students with teachers who have or lack skills in providing literacy and writing support in non-ELA courses – another variable that has an observable effect on my students but would be likely invisible to any VAM estimates of “my” effectiveness.