Research Brief: Access to “Effective Teaching” as per VAMs

Authors of a brief released by the Institute of Education Sciences (IES), the primary research arm of the United States Department of Education (USDOE), recently set out to “shed light on the extent to which disadvantaged students have access to effective teaching, based on value-added measures [VAMs].” The brief draws on three recent IES studies that have since been published in peer-reviewed journals and that together include 17 states in their analyses.

Researchers found, overall, that: (1) disadvantaged students receive less-effective teaching and have less access to effective teachers on average, a gap worth about four weeks of achievement in reading and about two weeks of achievement in mathematics as per VAM-based estimates, and (2) students’ access to effective teaching varies across districts.

On point (1), this is something we have known for years, contrary to what the authors of this brief write (i.e., “there has been limited research on the extent to which disadvantaged students receive less effective teaching than other students”). They simply dismiss a plethora of studies because researchers did not use VAMs to evaluate “effective teaching.” Linda Darling-Hammond’s research, in particular, has been critically important in this area for decades. It is a fact that, on average, students in high-needs schools that disproportionately serve disadvantaged students have less access to teachers who have certain teacher-quality indicators (e.g., National Board Certification and advanced degrees/expertise in content areas, although these things are argued not to matter in this brief). In addition, teacher turnover rates are higher in such schools, and oftentimes such schools become “dumping grounds” for teachers who cannot be terminated due to many of the tenure laws currently in focus and under fire across the nation. This is certainly a problem, as is disadvantaged students’ access to effective teachers. So, agreed!

On point (2), agreed again. Students’ access to effective teaching varies across districts. There is indeed a lot of variation in teacher quality across districts, thanks largely to local (and historical) educational policies (e.g., district and school zoning, charter and magnet schools, open enrollment, vouchers and other choice policies promoting public school privatization), all of which continue to perpetuate these problems. No surprise here, either, as we have also known this for decades, thanks to research that has not relied solely on VAMs, including work by, for example, Jonathan Kozol, bell hooks, and Jean Anyon, to name a few.

What is most relevant here, though, and in particular for readers of this blog, is that the authors of this brief used misinformed approaches when writing it and advancing their findings. That is, they used VAMs to examine the extent to which disadvantaged students receive “less effective teaching” by defining “less effective teaching” using only VAM estimates as the indicators of effectiveness, with each teacher’s estimate computed relative to other teachers across the very schools and districts in which they found that such grave disparities exist. All the while, not once did they mention how these disparities very likely biased the relative estimates on which they based their main findings.

Most importantly, they blindly accepted a largely unchecked and largely false assumption: that the teachers caused the relatively low growth in scores, rather than the low growth being caused by the bias inherent in the VAMs used to estimate the relative levels of “effective teaching” across teachers. This bias is, it seems weekly, becoming more apparent across VAMs and of increasing concern (see, for example, a recent post about a research study demonstrating this bias here).

This is also the same issue I detailed in a recent post titled, “Chicken or the Egg?” in which I deconstructed the “Which came first, the chicken or the egg?” question in the context of VAMs. This is becoming increasingly important as those using VAM-based data are using them to make causal claims, when only correlational (or in simpler terms relational) claims can and should be made. The fundamental question in this brief should have been, rather, “What is the real case of cause and consequence” when examining “effective teaching” in these studies across these states? True teacher effectiveness, or teacher effectiveness along with the bias inherent in and across VAMs given the relativistic comparisons on which VAM estimates are based…or both?!?

Interestingly enough, not once was “bias” even mentioned in either the brief or its accompanying technical appendix. It seems to these researchers, there ain’t no such thing. Hence, their claims are valid and should be interpreted as such.

That being said, we cannot continue to use VAM estimates (emphasis added) to support claims about bad teachers causing low achievement among disadvantaged students when VAM researchers increasingly evidence that these models cannot control for the disadvantages that disadvantaged students bring with them to the schoolhouse door. Until these models are bias-free (which is unlikely), never can claims be made that the teachers caused the growth (or lack thereof), or in this case more or less growth than other similar teachers with different sets of students non-randomly attending different districts and schools and non-randomly assigned into different classrooms with different teachers.

VAMs are biased by the very nature of the students and their disadvantages, both of which clearly contribute to the VAM estimates themselves.
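The mechanism described above can be illustrated with a deliberately naive sketch. It is not any actual VAM: real models include statistical controls, and the argument in this post is that those controls do not fully remove the problem. The sketch only shows what happens to whatever student disadvantage is left uncontrolled. Every number and name here (the two teachers, the class sizes, the shares of disadvantaged students, the size of the penalty) is a made-up assumption for illustration. Both simulated teachers are given exactly the same true effect, yet the teacher with more disadvantaged students receives the lower relative estimate.

```python
import random
from statistics import mean

def simulate_gains(n_students, share_disadvantaged, true_effect, penalty, rng):
    """Simulate test-score gains for one teacher's students (hypothetical data)."""
    gains = []
    for _ in range(n_students):
        disadvantaged = rng.random() < share_disadvantaged
        noise = rng.gauss(0.0, 1.0)
        # The penalty comes from the student's circumstances, not the teacher.
        gains.append(true_effect + (penalty if disadvantaged else 0.0) + noise)
    return gains

def naive_value_added(gains_by_teacher):
    """Each teacher's mean gain minus the mean gain of all students.

    This is a relative comparison with no controls for student background.
    """
    all_gains = [g for gains in gains_by_teacher.values() for g in gains]
    overall = mean(all_gains)
    return {t: mean(g) - overall for t, g in gains_by_teacher.items()}

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_EFFECT = 5.0        # assumed: identical for both teachers
PENALTY = -2.0           # assumed: uncontrolled effect of disadvantage on gains

gains_by_teacher = {
    # Non-random assignment: teacher_B's class is mostly disadvantaged students.
    "teacher_A": simulate_gains(500, 0.10, TRUE_EFFECT, PENALTY, rng),
    "teacher_B": simulate_gains(500, 0.80, TRUE_EFFECT, PENALTY, rng),
}

estimates = naive_value_added(gains_by_teacher)
for teacher, estimate in estimates.items():
    print(teacher, round(estimate, 2))
```

In this toy setup the gap between the two estimates reflects only who was assigned to which classroom, which is why a correlational finding of this kind cannot, on its own, support a causal claim about the teachers.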

It is also certainly worth mentioning that the research cited throughout this brief is not representative of the grander peer-reviewed research available in this area (e.g., research derived via Michelle Rhee’s “Students First”?!?). Likewise, having great familiarity with the authors of not only the three studies cited in this brief, but also the others cited “in support,” let’s just say their aforementioned sheer lack of attention to bias and what bias meant for the validity of their findings was (unfortunately) predictable.

As far as I’m concerned, the (small) differences they report in achievement may well be real, but to claim that teachers caused the differences because of their effectiveness, or lack thereof, is certainly false.

Citation: Institute of Education Sciences. (2014, January). Do disadvantaged students get less effective teaching? Key findings from recent Institute of Education Sciences studies. National Center for Education Evaluation and Regional Assistance. Retrieved from http://ies.ed.gov/ncee/pubs/20144010/

6 thoughts on “Research Brief: Access to ‘Effective Teaching’ as per VAMs”

  1. Hi! I apologize for the fact that this comment is not really relevant to the post you made. I just discovered your blog and really like it. I’m here to request you write a blog post (or link to a discussion or paper if one already exists) that addresses the following questions which immediately came to mind as I started learning about VAMs:

    1) So one basic point that you and others have correctly made is that VAMs do not fairly measure the impact of teachers who teach the students with whom it is hardest to make gains (ELLs, Special Ed, very high achieving students, very low achieving students, etc… I’ll call them “hard gainers” for brevity).

    Why isn’t it possible to assign some coefficient to the gains of hard gainers to correct for the fact that they are hard gainers? So if we decided that it takes 2x the effort to get a unit of improvement out of a hard gainer than it does a student who is not a hard gainer, could we just double the VAM points you get for hard gainers? I realize that’s a simplistic approximation of how a VAM works but I think it gets my question across. For purposes of the question, and for the sake of argument, I’d like to grant the pro-VAM crowd that the standardized tests accurately measure achievement.

    2) One thing I’ve heard is that VAM scores tend to vary wildly from year to year. Is there any data on that? Say, what is the average variance in VAM scores for a given teacher across several years?

    Whether you find time to answer or not, thanks for your excellent blog. Please spare no mathematical details in your answer… the technical stuff is important here.

    • Accounting for the “harder-to-teach” students is not as simple as we would all like it to be. I have been in major policy meetings in which very smart folks have discussed such “weights,” and they always come back to being highly arbitrary, and accordingly problematic. I think the bottom line, at least in terms of what I have concluded thus far, is that to rely on mathematics versus logic in all of this will a crazy person, and system, make. Also, because most of these issues happen on continuums, such weighting mechanisms are also much more complex than current approaches can handle.
      On to consistency, or the lack thereof, researchers agree VAM estimates are highly unstable. Probably the most cited study out there (by Schochet and Chiang, available in the references of this blog) evidences that, with good data, errors occur about 30% of the time. Others suggest as much as 50% of the time, depending on how many years of data we might have (with one year of data being least optimal). Does that make sense? Hope this helps!!

      • One of the places that VAMs fail to account for a continuum of effects is in the rotating nature of chronic absenteeism, such as when a teacher in a high-absenteeism school never has the same kids in the same class period more than a few times a year. With scripted curriculums, and even without, this makes it highly problematic to teach both those who come to class and those who don’t often show up and need to get caught up. In some cases it is fair to say that a teacher is being judged on the scores of students she did not have an honest chance to teach. Since VAMs are proprietary, no impartial, external “judge” sees the formulas, so no one knows if the absenteeism component is an on-off switch or an adjustment range and, if so, what the range is and who decided on it, based on what conception of its effects on value-added scores. The question of there being a point at which this single factor invalidates the use of VAM due to insufficient data is also never confronted.

      • Thanks so much for the reply. A couple follow-up questions:

        1) On weights, do you know of any papers which overview some of the weighting attempts and why they fail? Naively, it seems like it would be possible to just compare historic gains for various hard gainer groups vs the average for all students and assign an appropriate weight. I can envision problems with such an approach, but I don’t know their magnitude in practice.

        2) I read the abstract of the Schochet and Chiang paper and it seems like they are measuring the Type I and Type II error rates based on the hypotheses tested by the VAMs. I think (though I’m not sure) that this is different than what I was thinking about. What I noticed is that one hears a lot of anecdotes about VAM scores shifting wildly from year to year for a single teacher, which leads to doubts about the validity of the VAM since it is implausible that a teacher suddenly goes from fantastic to horrible then back to fantastic over a 3-year period. So I was wondering if any study exists that takes a district or state in which VAMs are used and computes the average of the absolute value of the difference between a teacher’s scores over any 2 scoring periods. It would also be cool to see the shape of the distribution of those averages to see if there are significant percentages of teachers for whom scores vary wildly year to year. If so, you could look at those teachers and then try to identify any obvious exogenous factors which might explain their shift. This could provide either further evidence that VAMs are highly problematic as measures of teacher effectiveness, or it could point to ways to improve VAMs.

        Thanks again for your post, your blog, and your work in general! I work on teacher quality and teacher evaluation (among other things) in a developing country and it is fascinating to learn about this stuff (we don’t use VAMs and have no plans to start).
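The year-to-year measure proposed in the comment above (the average absolute difference between a teacher’s scores across scoring periods) could be computed along the following lines. This is a minimal sketch under stated assumptions: the teacher names and scores are invented, not real VAM data, and “any 2 scoring periods” is read here as consecutive periods, which is only one possible interpretation.

```python
from statistics import mean

def mean_abs_change(scores_by_year):
    """Average absolute difference between consecutive scoring periods."""
    diffs = [abs(later - earlier)
             for earlier, later in zip(scores_by_year, scores_by_year[1:])]
    if not diffs:
        raise ValueError("need at least two scoring periods")
    return mean(diffs)

# Hypothetical scores for two teachers over three years (illustration only).
scores = {
    "teacher_1": [0.25, 0.5, 0.25],   # small swings
    "teacher_2": [1.0, -1.0, 1.0],    # "fantastic to horrible and back"
}

volatility = {teacher: mean_abs_change(s) for teacher, s in scores.items()}
for teacher, value in sorted(volatility.items(), key=lambda item: item[1]):
    print(teacher, value)
```

Sorting or plotting these per-teacher values across a district would give the distribution the commenter asks about; whether a large value signals a flawed model or a real change in teaching is not something the measure itself can settle.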

  2. This whole business seems like judging athletes who are running downhill to be far more effective than athletes who are running up hill. I’m a high school English teacher and even I can see that the “analyses” are completely flawed. Thanks for spelling it out once again.
