In November of 2013, I published a blog post about a “working paper” released by the National Bureau of Economic Research (NBER) and written by authors Thomas Dee – Economics and Educational Policy Professor at Stanford, and James Wyckoff – Economics and Educational Policy Professor at the University of Virginia. In the study titled “Incentives, Selection, and Teacher Performance: Evidence from IMPACT,” Dee and Wyckoff (2013) analyzed the controversial IMPACT educator evaluation system that was put into place in Washington DC Public Schools (DCPS) under the then Chancellor, Michelle Rhee. In this paper, Dee and Wyckoff (2013) presented what they termed to be “novel evidence” to suggest that the “uniquely high-powered incentives” linked to “teacher performance” via DC’s IMPACT initiative worked to improve the performance of high-performing teachers, and that dismissal threats worked to increase the voluntary attrition of low-performing teachers, as well as improve the performance of the students of the teachers who replaced them.
I critiqued this study in full (see both short and long versions of this critique here), and ultimately asserted that the study had “fatal flaws” which compromised the (exaggerated) claims Dee and Wyckoff (2013) advanced. These flaws included, but were not limited to, the fact that only 17% of the teachers included in this study (i.e., teachers of reading and mathematics in grades 4 through 8) were actually evaluated under the value-added component of the IMPACT system. Put inversely, 83% of the teachers included in this study about teachers’ “value-added” did not have student test scores available to determine if they were indeed of “added value.” That is, 83% of the teachers evaluated were instead assessed on their overall levels of effectiveness, or subsequent increases/decreases in effectiveness, as per only the subjective observational and other self-report data included within the IMPACT system. Hence, while the authors’ findings were presented as hard fact, the 17% figure means that their (exaggerated) conclusions did not generalize across teachers, despite what they claimed.
In short, the extent to which Dee and Wyckoff (2013) oversimplified very complex data to oversimplify a very complex context and policy situation, after which they exaggerated questionable findings, was at issue and should have been reconciled or cleared up prior to the study’s release. I should add that this study was published in 2015 in the (economics-oriented and not educational-policy-specific) Journal of Policy Analysis and Management (see here), although I have not since revisited the piece to compare (e.g., via a content analysis) the original 2013 version to the final 2015 version.
Anyhow, they are at it again. Just this past January (2016) they published another report, albeit alongside two additional authors: Melinda Adnot (a Visiting Assistant Professor at the University of Virginia) and Veronica Katz (an Educational Policy PhD student, also at the University of Virginia). This study, titled “Teacher Turnover, Teacher Quality, and Student Achievement in DCPS,” was also (prematurely) released as a “working paper” by the same NBER, again without any internal or external vetting but (irresponsibly) released “for discussion and comment.”
Hence, I provide my “discussion and comments” below, all the while underscoring how this continues to be problematic, also given the fact that I was contacted by the media for comment. Frankly, no media reports should be released about these (or for that matter any other) “working papers” until they are not only internally but also externally reviewed (e.g., in press or published, post vetting). Unfortunately, as it too commonly does, NBER released this report, clearly without such concern. Now, we as the public are responsible for consuming this study with much critical caution, while also advocating that others do (and helping others to do) the same. Hence, I write into this post my critiques of this particular study.
First, the primary assumption (i.e., the “conceptual model”) driving this Adnot, Dee, Katz, & Wyckoff (2016) piece is that low-performing teachers should be identified and replaced with more effective teachers. This is akin to the assumption noted in the first Dee and Wyckoff (2013) piece. It should be noted here that in DCPS teachers rated as “Ineffective” or consecutively as “Minimally Effective” are “separated” from the district; hence, DCPS has adopted educational policies that align with this “conceptual model” as well. Interesting to note is how researchers, purportedly external to DCPS, entered into this study with the same a priori “conceptual model.” This, in and of itself, is an indicator of researcher bias (see also forthcoming).
Nonetheless, Adnot et al.’s (2016) highlighted finding was that “on average, DCPS replaced teachers who left with teachers who increased student achievement by 0.08 SD [standard deviations] in math.” Buried deeper in the report, they also found that DCPS replaced teachers who left with teachers who increased student achievement by 0.05 SD in reading (at not a 5% but a 10% statistical significance level). These findings, in simpler but also more realistic terms, mean that (if actually precise and correct, also given all of the problems with how teacher classifications were determined at the DCPS level), “effective” mathematics teachers who replaced “ineffective” mathematics teachers increased student achievement by approximately 2.7%, and “effective” reading teachers who replaced “ineffective” reading teachers increased student achievement by approximately 1.7% (at not a 5% but a 10% statistical significance level). These are hardly groundbreaking results, as these proportional movements likely represented one or maybe two total test items on the large-scale standardized tests used to assess DCPS’s policy impacts.
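For readers who want another way to put effect sizes like these in perspective, here is a minimal sketch of one common reading: how far a student who starts at the 50th percentile would move, assuming test scores are normally distributed. That normality assumption, and the function names `normal_cdf` and `percentile_shift`, are illustrative only. They do not come from the Adnot et al. (2016) paper, and this is not necessarily how the percentages cited above were derived.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_shift(effect_sd: float) -> float:
    """Percentile points gained by a median student after a shift of
    `effect_sd` standard deviations (ASSUMES normally distributed scores)."""
    return 100.0 * (normal_cdf(effect_sd) - 0.5)


# Effect sizes as reported in the working paper: 0.08 SD (math), 0.05 SD (reading).
for subject, effect in (("math", 0.08), ("reading", 0.05)):
    shift = percentile_shift(effect)
    print(f"{subject}: {effect:.2f} SD is about {shift:.1f} percentile points")
```

Under this assumption, the math effect moves a median student up by roughly three percentile points and the reading effect by roughly two, which is in the same small range as the percentages discussed above.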
Interesting to also note is that not only were the “small effects” exaggerated to mean so much more than what they are actually worth (see also forthcoming), but also that only the larger of the two findings (the mathematics finding) is highlighted in the abstract. The complementary and smaller reading effect is actually buried in the text. Also buried is that these findings pertain only to general education teachers in grades four through eight who were value-added eligible, akin to Dee and Wyckoff’s (2013) earlier piece (e.g., typically 30% of a school’s population, although Dee and Wyckoff’s (2013) piece marked this percentage at 17%).
As mentioned prior, none of this would have likely happened had this piece been internally and/or externally reviewed prior to this study’s release.
Regardless, Adnot et al. (2016) also found that the attrition of relatively higher-performing teachers (e.g., “Effective” or “Highly Effective”) had a negative but also statistically insignificant effect.
Hence, it can be concluded that the “finding” highlighted in the abstract of this piece was not the only “finding”; rather, the researchers (perhaps purposefully) buried these other findings in the text. It is possible, in other words, that because these other findings did not support the researchers’ a priori conclusions and claims, the researchers chose not to bring attention to these findings, or rather the lack thereof (e.g., in the abstract).
Relatedly, I should note that in a few places the authors exaggerate how tangible teachers’ effects on their students’ achievement are, without any mention of contrary reports, namely the one published by the American Statistical Association (ASA), in which the ASA evidenced that these (oft-exaggerated) teacher effects account for no more than 1%-14% of the variance in students’ growth scores (see more information here). In fact, teacher effectiveness is very likely not “qualitatively large” as Adnot et al. (2016) argue, without evidence, and also imply throughout this piece as a foundational part of their aforementioned “conceptual model.”
Likewise, while most everyone would likely agree that there are “stark inequities” in students’ access to effective teachers, how to improve this condition is certainly a matter of great debate, which is neither explicitly nor implicitly acknowledged anywhere in this piece. Rather, much disagreement and debate, in fact, still exist regarding whether inducing teacher turnover will really get “us” anywhere in terms of school reform, as also related to how big (or small) teachers’ effects on students’ measurable performance actually are, as discussed prior. Accordingly, and perhaps not surprisingly, Adnot et al. (2016) cite only the articles of other researchers, or rather members of their proverbial choir (e.g., Eric Hanushek who, without actual evidence, has been hypothesizing for nearly a decade about how replacing “ineffective” teachers with “effective” teachers will reform America’s schools) to support these same a priori conclusions. Consequently, the professional integrity of the researchers must be called into question given these simple (albeit biased) errors.
Taking all of this into consideration, I would hardly call the findings they advanced in this piece (and emphasized in the abstract) solid indicators of the “overall positive effects of teacher turnover,” with only one statistically but not practically significant finding of note in mathematics (i.e., a 2.7% increase, if accurate). None of this could/should, accordingly, lead anyone to conclude that “the supply of entering teachers appears to be of sufficient quality to sustain a relatively high turnover rate.”
Hence, this is yet another case of these authors oversimplifying very complex data to oversimplify a very complex context and policy situation, after which they exaggerated negligible findings while also dismissing others.
Relatedly, would we not expect greater results given that teachers who are deemed highly effective are to be given one-time bonuses of up to $25,000, and permanent increases to their base salaries of up to $27,000 per year? This bang, or lack thereof, may not be worth the buck, either.
Additionally, is an annual attrition rate of “low-performing teachers” (e.g., classified as such for one or two consecutive years) in the district, currently hovering at around 46%, worth these diminutive results?
Did they also actually find, overall, that “high-poverty schools actually improve as a result of teacher turnover”? I don’t think so, but do give this study a full read to test their conclusions, as well as mine, for yourself (see, again, the full study here).
In the end, Adnot et al. (2016) do conclude that they found “that the overall effect of teacher turnover in DCPS conservatively had no effect on achievement and, under reasonable assumptions, improved achievement.” This is a MUCH more balanced interpretation of this study, although I would certainly question their “reasonable assumptions” (see also prior). Moreover, it is curious that we had to wait until the end for the actual headline of this study. This is especially important given that others, including members of the media, public, and policy making community, might not make it that far (i.e., trusting only what is in the abstract).
Adnot, M., Dee, T., Katz, V., & Wyckoff, J. (2016). Teacher turnover, teacher quality, and student achievement in DCPS [Washington DC Public Schools]. Cambridge, MA: National Bureau of Economic Research (NBER). Retrieved from http://www.nber.org/papers/w21922.pdf
Dee, T., & Wyckoff, J. (2013). Incentives, selection, and teacher performance: Evidence from IMPACT. Cambridge, MA: National Bureau of Economic Research (NBER). Retrieved from http://www.nber.org/papers/w19529.pdf