The (Relentless) Will to Quantify


An article was just published in the esteemed, peer-reviewed journal Teachers College Record titled, “The Will to Quantify: The “Bottom Line” in the Market Model of Education Reform” and authored by Leo Casey – Executive Director of the Albert Shanker Institute. I will summarize its key points here (1) for those of you without the time to read the whole article (albeit worth reading) and (2) just in case the link above does not work for those of you out there without subscriptions to Teachers College Record.

In this article, Casey reviews the case of New York and the state department of education’s policy attempts to use New York teachers’ value-added data to reform the state’s public schools, “in the image and likeness of competitive businesses.” Casey interrogates this state’s history given the current, market-based, corporate reform environment surrounding (and swallowing) America’s public schools within New York, but also beyond.

Recall that New York is one of our states to watch, especially since Governor Cuomo’s election (see prior posts about New York here, here, and here). Accordingly, as Casey demonstrates in this article, this is the state to use to show how “[t]he market model of education reform has become a prisoner to a Nietzschean will to quantify, in which the validity and reliability of the actual numbers is irrelevant.”

In New York, using the state’s large-scale standardized tests in English/language arts and mathematics, grades 3 through 8, teachers’ value-added data reports were first developed for approximately 18,000 teachers throughout the state for three school years: 2007-2010. The scores were constructed with all assurances that these scores “would not be used for [teacher] evaluation purposes,” while the state department specifically identified tenure decisions and annual rating processes as two areas where teachers’ value-added scores “would play no role.” At that time the department of education also took a “firm position” that these reports would not be disclosed or shared outside of the school community (i.e., with the public).

Soon thereafter, however, the department of education, “acting unilaterally,” began to use the scores in tenure decisions and, after a series of Freedom of Information requests, began to release the scores to the media, who in turn released the scores to the public at large. By February of 2012, teachers’ value-added scores were published by all major New York media.

Recall these articles, primarily about the worst teachers in New York (see, for example, here, here, and here), and recall the story of Pascale Mauclair – a sixth-grade teacher in Queens who was “pilloried” in the New York Post as the city’s “worst teacher” based solely on her value-added reports. After a more thorough investigation, however, “Mauclair proved to be an excellent teacher who had the unqualified support of her school, one of the best in the city: her principal declared without hesitation or qualification that she would put her own child in Mauclair’s class, and her colleagues met Mauclair with a standing ovation when she returned to the school after the Post’s attack. Mauclair’s undoing had been her own dedication to teaching students with the greatest needs. As a teacher of English as a Second Language, she had taken on the task of teaching small self-contained classes of recent immigrants for the last five years.”

Nonetheless, the state department of education continued (and continues) to produce data for New York teachers “with a single year of test score data, and sample sizes as low as 10…When students did not have a score in a previous year, scores were statistically “imputed” to them in order to produce a basis for making a growth measure.”

These scores had, and often continue to have (also across states), “average confidence intervals of 60 to 70 percentiles for a single-year estimate. On a distribution that went from 0 to 99, the average margins of error in the [New York scores] were, by the [state department of education’s] own calculations, 35 percentiles for Math and 53 percentiles for English Language Arts. One-third of all [scores], the [department] conceded, were completely unreliable—that is, so imprecise as to not warrant any confidence in them. The sheer magnitude of these numbers takes us into the realm of the statistically surreal.” Yet the state continues to this day in its efforts to use these data despite the gross statistical and consequential human errors present.
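To make those margins concrete, here is a minimal sketch in Python of what they imply for a single teacher's score. It assumes the reported margin of error is read as a symmetric plus-or-minus band around the score on the 0 to 99 scale; that reading, the function name `percentile_range`, and the example score of 50 are illustrative assumptions, not details from Casey's article.

```python
def percentile_range(score, margin, low=0, high=99):
    """Return the (lower, upper) band implied by score +/- margin,
    clipped to the ends of the percentile scale."""
    return max(low, score - margin), min(high, score + margin)


# Average margins of error quoted above (in percentiles).
MARGINS = {"Math": 35, "English Language Arts": 53}

# Hypothetical teacher reported at the 50th percentile.
for subject, margin in MARGINS.items():
    lower, upper = percentile_range(50, margin)
    print(f"{subject}: plausible range {lower} to {upper}")
```

Under this reading, a teacher reported at the 50th percentile in English Language Arts could plausibly sit anywhere on the entire scale, which is the sense in which the numbers are "statistically surreal."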

This, in the words of Casey, is “a demonstration of [extreme] professional malpractice in the realm of testing.” Yet for educational reformers like Governor Cuomo, as well as “Michael Bloomberg, Joel Klein, and a cohort of similarly minded education reformers across the United States, the fundamental problem with American public education is that it has been organized as a monopoly that is not subject to the discipline of the marketplace. The solution to all that ails public schools, therefore, is to remake them in the image and likeness of a competitive business. Just as private businesses rise and fall on their ability to compete in the marketplace, as measured by the ‘bottom line’ of their profit balance sheet, schools need to live or die on their ability to compete with each other, based on an educational ‘bottom line.’ If ‘bad’ schools die and new ‘good’ schools are created in their stead, the productivity of education improves. But to undertake this transformation and to subject schools to market discipline, an educational ‘bottom line’ must be established. Standardized testing and value-added measures of performance based on standardized testing provide that ‘bottom line.’”

Otherwise, some of the key findings taken from other studies Casey cited in this piece are also good to keep in mind:

  • “A 2010 U.S. Department of Education study found that value-added measures in general have disturbingly high rates of error, with the use of three years of test data producing a 25% error rate in classifying teachers as above average, average, and below average and one year of test data yielding a 35% error rate.” Nothing much has changed in terms of error rates here, so this study still stands as one of the capstone pieces on this topic.
  • “New York University Professor Sean Corcoran has shown that it is hard to isolate a specific teacher effect from classroom factors and school effects using value-added measures, and that in a single year of test scores, it is impossible to distinguish the teacher’s impact. The fewer the years of data and the smaller the sample (the number of student scores), the more imprecise the value-added estimates.”
  • Also recall that “the tests in question [are/were] designed for another purpose: the measure of student performance, not teacher or school performance.” That is, these tests were designed and validated to measure student achievement, BUT they were never designed or validated for their current purposes/uses: to measure teacher effects on student achievement.
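
The point about small samples can be illustrated with a rough sketch. It uses the textbook standard error of a mean (spread divided by the square root of the sample size), which is a deliberate simplification and not the value-added model New York actually used. The student-level spread of 20 points and the class sizes other than 10 are hypothetical values chosen for illustration.

```python
import math


def standard_error(student_sd, n_students):
    """Textbook standard error of a class average: SD / sqrt(n).
    Assumes independent student scores (a simplifying assumption)."""
    return student_sd / math.sqrt(n_students)


STUDENT_SD = 20  # hypothetical spread of student scores, in points

# 10 matches the smallest sample size mentioned in the post; the others are illustrative.
for n in (10, 25, 100):
    se = standard_error(STUDENT_SD, n)
    print(f"n = {n:3d} students -> standard error of about {se:.1f} points")
```

Because the uncertainty shrinks only with the square root of the sample size, an estimate built on 10 students is roughly three times noisier than one built on 100.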

2 thoughts on “The (Relentless) Will to Quantify”

  1. Can anyone imagine any business calculating its “bottom line” using data with errors as large as these? Or using such unreliable data for any purpose?

  2. In the old movies, you knew that vampires needed blood. In the new horror show of assessments, VAMpires need numbers. Some states solve this problem by creating EOC (end-of-course) exams for all kinds of subjects. In this way they can assign values to teachers without having to derive their scores from students they don’t teach.

    However, this does lead to the absurdity of EOC exams in such subjects as:

    Agricultural Mechanics
    Drivers Education
    Music
    Drama
    Ceramics

    As an example, check out the bizarre administration manuals for music where the requirement for a second exam grader for the performance part of the test results in a kabuki dance of a qualified music teacher trying to explain to his colleague how to judge music performance. By design, the second judge does not need to have any musical skills or knowledge! However, I am certain the psychometricians are required to calculate inter-rater reliability for the test! Numbers rule!

    More real and amazing EOC exams can be found on the New Mexico PED public website:

    http://www.ped.state.nm.us/AssessmentAccountability/AssessmentEvaluation/EOC/EOCBlueprints.html

    Thanks!
