Last Tuesday, Eduardo Porter – writer of the Economic Scene column for The New York Times – wrote an excellent article, from an economics perspective, about what is happening with our current obsession in educational policy with “Grading Teachers by the Test.” Do give the article a full read; it’s well worth it. Also consider here what I see as some of Porter’s strongest points to ponder.
Porter writes about what economists often refer to as Goodhart’s Law, which states that “when a measure becomes the target, it can no longer be used as the measure.” This occurs because of the value (mis)placed on any measure, and the distortion (i.e., artificial inflation or deflation, depending on the desired direction of the measure) that often, if not always, comes about as a result.
Remember when the Federal Aviation Administration was to hold airlines “accountable” for their on-time arrivals? Airlines responded by meeting the “new and improved” FAA standards merely by extending their estimated times of arrival (ETAs) on the back end. Airlines did little, if anything, else of instrumental value. As cited by Porter, the same thing happens when U.S. hospitals “do whatever it takes to keep patients alive at least 31 days after an operation, to beat Medicare’s 30-day survival yardstick.” This is Goodhart’s Law at play.
In education we more commonly refer to Goodhart’s Law’s interdisciplinary cousin, Campbell’s Law, which states that “the more any quantitative social indicator (or even some qualitative indicator) is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” In his 1976 paper, Campbell wrote that “achievement tests may well be valuable indicators of general school achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways” (click here for more information). This is the main point of this New York Times piece. See also a classic article on test score pollution here and a fantastic book all about this here.
My colleague Jesse Rothstein – Associate Professor of Economics at the University of California, Berkeley – is cited as saying “We don’t know how big a deal this is…[but]…It is one of [our] main concerns.” I would add that while we cannot quantify this or how often it actually occurs, nor will we ever likely be able to do so, we can state with 100% certainty that this is, indeed, a very big deal. Porter agrees, citing multiple research studies in support (including one of mine here and another of my former PhD student’s here).
See also a research study I published about this with a set of colleagues in 2010, titled “Cheating in the first, second, and third degree: Educators’ responses to high-stakes testing,” in which we got as close as anyone of whom I am aware to understanding not only the frequencies but also the varieties of ways in which educators distort, sometimes consciously and sometimes subconsciously, their students’ test scores. They also oftentimes engage in such practices believing that their test-preparation and other test-related practices are in their students’ best interests. Note, however, that this study took place before even more consequences were attached to students’ test scores, given the federal government’s more recent fascination with holding teachers accountable for their quantifiable-by-student-test-scores “value-added.”
It is a fascinating phenomenon, really, albeit a highly unfortunate one given the consequences that those external to America’s public education system (e.g., educational policymakers) continue to tie to teachers’ test scores, regardless, and now more than ever before.
As underscored by Porter, “American education has embarked upon a nationwide experiment in incentive design. Prodded by the Education Department, most states have set up evaluation systems for teachers built on the gains of their students on standardized tests, alongside more traditional criteria like evaluations from principals.” All of this has occurred despite the research evidence, some of which is over 20 years old, and despite the profession’s calls (see, for example, here) to stop what, thanks to Goodhart’s and Campbell’s Laws, can often be understood as noise and nonsense.