The Washington Post also Captures the ASA’s Recent Position Statement on VAMs

Yesterday I released the second of two posts about the American Statistical Association’s recently released Position Statement on using VAMs for educational assessment and accountability. Below is a third post, again warranted given the considerable significance of this statement.

This one comes from The Washington Post – The Answer Sheet by Valerie Strauss and is pasted here, almost in full.

Strauss writes: “You can be certain that members of the American Statistical Association, the largest organization in the United States representing statisticians and related professionals, know a thing or two about data and measurement. That makes the statement that the association just issued very important for school reform.

The ASA just slammed the high-stakes “value-added method” (VAM) of evaluating teachers that has been increasingly embraced in states as part of school-reform efforts. VAM purports to be able to take student standardized test scores and measure the “value” a teacher adds to student learning through complicated formulas that can supposedly factor out all of the other influences and emerge with a valid assessment of how effective a particular teacher has been.

These formulas can’t actually do this with sufficient reliability and validity, but school reformers have pushed this approach and now most states use VAM as part of teacher evaluations. Because math and English test scores are available, reformers have devised bizarre implementation methods in which teachers are assessed on the test scores of students they don’t have or subjects they don’t teach. When Michelle Rhee was chancellor of D.C. public schools (2007-10), she was so enamored with using student test scores to evaluate adults that she implemented a system in which all adults in a school building, including the custodians, were in part evaluated by test scores.

Assessment experts have been saying for years that this is an unfair way to evaluate anybody, especially for high-stakes purposes such as pay, employment status, tenure or even the very survival of a school. But reformers went ahead anyway on the advice of some economists who have embraced the method (though many other economists have panned it). Now the statisticians have come out with recommendations for the use of VAM for teachers, principals and schools that school reformers should — but most likely won’t — take to heart…”

“…Some economists have gone so far as to say that higher VAM scores for teachers lead to more economic success for their students later in life. Work published by the National Bureau of Economic Research, done by authors Raj Chetty, John N. Friedman and Jonah E. Rockoff [the source of a prior VAMboozled! post here], has made that claim, though there are some big problems with their research, according to an analysis of their latest study published [titled “Lost of Impact, Little Value”] by the National Education Policy Center at the University of Colorado Boulder [also the source of a VAMboozled! post soon to be released]. The analysis finds a number of key problems with the report making the link between VAM of teachers and financial success of students, including the fact that their own results show that VAM calculation for teachers is unreliable…”

“…The evidence against VAM is at this point overwhelming. The refusal of school reformers to acknowledge it is outrageous.”


Research Brief: Access to “Effective Teaching” as per VAMs

Researchers of a brief released from the Institute of Education Sciences (IES), the primary research arm of the United States Department of Education (USDOE), recently set out to “shed light on the extent to which disadvantaged students have access to effective teaching, based on value-added measures [VAMs]” as per three recent IES studies that have since been published in peer-reviewed journals and that include in their analyses 17 total states.

Researchers found, overall, that: (1) disadvantaged students receive less effective teaching and have less access to effective teachers on average, a difference worth about four weeks of achievement in reading and about two weeks of achievement in mathematics as per VAM-based estimates, and (2) students’ access to effective teaching varies across districts.

On point (1), this is something we have known for years, contrary to what the authors of this brief write (i.e., that “there has been limited research on the extent to which disadvantaged students receive less effective teaching than other students”). They simply dismiss a plethora of studies because researchers did not use VAMs to evaluate “effective teaching.” Linda Darling-Hammond’s research, in particular, has been critically important in this area for decades. It is a fact that, on average, students in high-needs schools that disproportionately serve disadvantaged students have less access to teachers with certain teacher-quality indicators (e.g., National Board Certification and advanced degrees/expertise in content areas, although these things are argued not to matter in this brief). In addition, there are higher teacher turnover rates in such schools, and oftentimes such schools become “dumping grounds” for teachers who cannot be terminated due to many of the tenure laws currently at focus and under fire across the nation. This is certainly a problem, as is disadvantaged students’ access to effective teachers. So, agreed!

On point (2), agreed again. Students’ access to effective teaching varies across districts. There is indeed a lot of variation in terms of teacher quality across districts, thanks largely to local (and historical) educational policies (e.g., district and school zoning, charter and magnet schools, open enrollment, vouchers and other choice policies promoting public school privatization), all of which continue to perpetuate these problems. No surprise really, here, either, as we have also known this for decades, thanks to research that has not been based solely on the use of VAMs but research by, for example, Jonathan Kozol, bell hooks, and Jean Anyon to name a few.

What is most relevant here, though, and in particular for readers of this blog, is that the authors of this brief used misinformed approaches when advancing their findings. That is, they used VAMs to examine the extent to which disadvantaged students receive “less effective teaching,” defining “less effective teaching” using only VAM estimates as the indicators of effectiveness, relative to other teachers across the very schools and districts in which they found that such grave disparities exist. All the while, not once did they mention how these disparities very likely biased the relative estimates on which they based their main findings.

Most importantly, they blindly accepted a largely unchecked and largely false assumption: that the teachers caused the relatively low growth in scores, rather than the low growth being caused by the bias inherent in the VAMs used to estimate relative levels of “effective teaching” across teachers. This bias across VAMs is still, it seems weekly, becoming more apparent and of increasing concern (see, for example, a recent post about a research study demonstrating this bias here).

This is also the same issue I detailed in a recent post titled “Chicken or the Egg?” in which I deconstructed the “Which came first, the chicken or the egg?” question in the context of VAMs. This is becoming increasingly important as those using VAM-based data are making causal claims when only correlational (or, in simpler terms, relational) claims can and should be made. The fundamental question in this brief should have been, rather: What is the real case of cause and consequence when examining “effective teaching” in these studies across these states? True teacher effectiveness, or teacher effectiveness confounded with the bias inherent in and across VAMs, given the relativistic comparisons on which VAM estimates are based…or both?!?

Interestingly enough, not once was “bias” even mentioned in either the brief or its accompanying technical appendix. It seems to these researchers, there ain’t no such thing. Hence, their claims are valid and should be interpreted as such.

That being said, we cannot continue to use VAM estimates (emphasis added) to support claims about bad teachers causing low achievement among disadvantaged students when VAM researchers increasingly evidence that these models cannot control for the disadvantages that disadvantaged students bring with them to the schoolhouse door. Until these models are bias-free (which is unlikely), claims can never be made that the teachers caused the growth (or lack thereof), or in this case more or less growth than other similar teachers with different sets of students, non-randomly attending different districts and schools and non-randomly assigned to different classrooms with different teachers.

VAMs are biased by the very nature of the students and their disadvantages, both of which clearly contribute to the VAM estimates themselves.
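To make this concrete, here is a minimal, purely illustrative simulation (not any actual VAM, and all numbers here are invented for illustration): two teachers with identical true effects, where students are non-randomly sorted by disadvantage and disadvantage also depresses score growth. A naive mean-growth comparison then misattributes the disadvantage effect to the teacher.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

def naive_value_added(true_teacher_effect, mean_disadvantage, n_students=30):
    """Mean score growth for one classroom under a toy data-generating model."""
    growth = []
    for _ in range(n_students):
        disadvantage = random.gauss(mean_disadvantage, 1.0)
        noise = random.gauss(0.0, 1.0)
        # Growth depends on the teacher AND on student disadvantage.
        growth.append(true_teacher_effect - 0.5 * disadvantage + noise)
    return sum(growth) / len(growth)

# Both teachers have the SAME true effect (1.0)...
teacher_a = naive_value_added(1.0, mean_disadvantage=0.0)  # advantaged students
teacher_b = naive_value_added(1.0, mean_disadvantage=2.0)  # disadvantaged students

# ...yet the teacher serving disadvantaged students "looks" less effective,
# because the naive comparison attributes the disadvantage effect to the teacher.
print(f"Teacher A, naive value-added: {teacher_a:.2f}")
print(f"Teacher B, naive value-added: {teacher_b:.2f}")
```

Real VAMs attempt to adjust for prior scores and student characteristics, but the sketch shows why any residual, uncontrolled disadvantage effect lands in the teacher’s estimate.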

It is also certainly worth mentioning that the research cited throughout this brief is not representative of the broader peer-reviewed research available in this area (e.g., research derived via Michelle Rhee’s “Students First”?!?). Likewise, having great familiarity with the authors of not only the three studies cited in this brief but also the others cited “in support,” let’s just say their aforementioned sheer lack of attention to bias, and to what bias meant for the validity of their findings, was (unfortunately) predictable.

As far as I’m concerned, the (small) differences they report in achievement may well be real, but to claim that teachers caused those differences because of their effectiveness, or lack thereof, is certainly false.

Citation: Institute of Education Sciences. (2014, January). Do disadvantaged students get less effective teaching? Key findings from recent Institute of Education Sciences studies. National Center for Education Evaluation and Regional Assistance.

The Trojan Horse in South Carolina

Following one of our most recent posts re: Rhee in South Carolina, a reader (Ian Kay) wrote in the following comment, which I thought important to share with the rest of you.

Ian writes:

[Rhee] will succeed here in South Carolina, because she has the backing of the monied interests, including Gates, Jeff Bezos, and others. The newspapers will not print any criticism of this bunch, because they are in collusion with the goals of this movement to denigrate the teaching profession, and reap the money which is always there, in good times and in bad. I tried many times to warn the community here in Charleston about this fraud, to no avail. Here is a sample of the kind of letter to the editor which they will not print. The truth from a veteran teacher will not see the light of day [until now – thanks for writing!].

The Trojan Horse is Here.

As we veteran teachers predicted, the death knell for public education is spreading slowly and inexorably throughout the country, and as the Post and Courier reported last week, Michelle Rhee and her for-profit destroyers have claimed two more districts with their fraudulent promise to improve the educational system. The State Board of Education, obviously with the blessings of the Superintendent, has approved the extension of the “non-profit” Teach for America program by placing two more districts here in the low country within their grasp. If the public believes that this is a non-profit agenda, I still have that Brooklyn Bridge for sale at a reasonable price. I wonder how much they promised the board, the Superintendent, and all the palms they had to grease to get this program on board? I am a 28-year veteran teacher who stayed the course, and I am no longer intimidated by administrators and fraudulent educational leaders. What can they do to me, stop my Social Security check? It’s time that someone tells the public what is really going on in the school systems, and in these contrived fiefdoms known as districts, that not only abuse the teaching staff but waste taxpayer money to fix what they caused in the first place. Where is the outrage on the part of teachers at the article detailing the outsourcing of substitutes to Kelly Educational Staffing? The School District says that it wants to expand the concept to the entire district this fall. Human Resources director Julie Rogers is quoted as saying that “They do a really good job of training their substitutes. It’s not just a warm body, they get in there and teach.” This is proof positive that the present educational leadership is not capable of operating a school system. Where is the indignation from the educational departments of our local colleges that train teachers? I could rail for a long time, but you get the picture, I’m sure.
It’s sickening and it’s time to return the system to some normalcy, and discard these frauds posing as educational leaders, before the entire system implodes.

Ian Kay
West Ashley

Rhee Coming to AZ?!?

Speaking of Rhee, mentioned in a prior post today, it seems that she is to be in my state of Arizona, as part of her most recent “tour.”

Arizona, continuously basking in the glow of negative attention coming largely from the uber-conservative policies supported by our current Republican Governor Jan Brewer, may soon have another gubernatorial leader: Scott Smith.

I was sent, in secret, an invitation to a “Reception in Honor” of gubernatorial candidate Smith…and his special guest of the evening…[insert drum roll]…Michelle Rhee.

The party is to celebrate their collective stance towards “Efficient, Effective & Accountable Leadership for Arizona.” Sound familiar?

Anyhow, a ticket to this event comes at the cost of a campaign contribution ranging from a minimum of $250 per person to a maximum of $4,000 per person. The event is to be held tonight, Friday, February 21, from 5:00-6:00 (yes, for one hour) at The Montelucia Resort in Paradise Valley, a wealthy suburb within the metropolitan Phoenix area.

It seems wealth is a politician’s common denominator of power, now doesn’t it?

On that note, I heard and then verified recently that current AZ Governor Brewer’s highest level of education is a radiological technologist certificate, earned to work as an x-ray technician. That too says a lot about the power of money and the (too often, in the case of Arizona) very misguided policies it can bring. Thanks to Rhee, who can now help my state along in its misguidedness.

As related to Arizona’s teacher accountability system in particular, the state department of education has, so far, at least tried to maintain some sanity in terms of its teacher accountability and related VAM-based policies, leaving much of this in the hands of districts and schools, which still very much honor and appreciate their local control. I like to think that I have had at least a little to do with this current stance which, while not ideal, is reasonable given the current sociopolitical circumstances of my state.

But it looks like if this candidate wins, we might be in way worse shape than we are now, despite our best intentions…and, again, with special thanks to policy clots like Rhee.

Michelle Rhee Rhumbuggery

Michelle Rhee, as many of you know, is the founder and current CEO of StudentsFirst, as well as the former Chancellor of Washington D.C.’s public schools. During her tenure there, she enacted a strict, controversial teacher evaluation system (i.e., IMPACT) that has been the source of different posts here and here, most recently following the “gross” errors in 44 D.C. public school teachers’ evaluation scores.

Well, it now seems that she is on tour. She recently testified before South Carolina’s K-12 Subcommittee of the House Education and Public Works Committee, specifically “about legislation regarding improving teacher evaluations and rewarding effective teachers in the public school system in S.C.” It worked so well in D.C., right?!?

This “rock star,” yet again, brought with her her one-string guitar, speaking in favor of South Carolina’s House Bill 4419, which “addresses the way teachers are evaluated [i.e., using VAMs], rewards effective teachers with recognition and the opportunity to earn higher salaries [i.e., merit pay] and gives school leaders the opportunity and the tools to build and maintain a quality team of teachers [i.e., humbuggery, for lack of a more colorful term].”

She also brought with her her typical and emotive chorus lines:

  • “[M]ore and more state legislators are beginning to understand the crisis that we are in in America as it pertains to our public education system.”
  • “[A]s a nation, we are not doing everything that we should be doing to ensure that we are providing all children with the excellent education that they deserve.”
  • “[C]hildren who are in school today will be the first generation of Americans to be less well educated than their parents were for the first time in the history of our country.”
  • The U.S. has “recently dropped to becoming about 15th, 17th and 26th in reading, science and math (respectively).”
  • “Instead of competing against students across the country for jobs, students in U.S. public schools will be competing for jobs against children in Singapore, China and Korea.”

And her best one-liner yet:

  • “In so many school districts across the country, you’ll hear stories about how people will come in, you know, as long as you pass the criminal check, and you’ve got a pulse, you can get a job in the classroom, and then once you have that job, you know, you have that job forever.”

Rhetoric or reality? I would never intend to insult the intelligence of the readers of this blog, so I will just leave it at that.

But it seems that at least one educator presented a counterpoint to Rhee’s testimony, although he unfortunately applauded some of Rhee’s efforts and used gentler tactics to argue against the bill’s passage.

A local professor, however, noted that state legislators should really “read the research [emphasis added] and bring experts and educators in on the conversation before a decision is made [emphasis added].”

I sure do hope legislators at least begin to heed this call, and begin to deconstruct and better understand her humbuggery, or more specifically, the fraudulent soundbites Rhee continues to use to advance her highly misguided educational policies and vested interests.

Follow-Up to Previous Post

I want to bring attention to a revision I made to the previous post about the 44 teachers “misclassified” in DC. I want to be clear that while only 44 teachers were officially acknowledged as having received incorrect teacher evaluation scores, the actual number is unquestionably much higher, given that these formulas are always “subject to error”…and, in fact, subject to gross errors, always, across the board. Regardless of what the official reports might reveal, it should be duly noted that it was not just these 44 who were “misclassified” due to this “minor glitch.”

Thanks to Bruce Baker, Professor at Rutgers and author of School Finance 101, for the collegial reminder to clarify this point.

More Value-Added Problems in DC’s Public Schools

Over the past month I have posted two entries about what’s going on in DC’s public schools with the value-added-based teacher evaluation system developed and advanced by former School Chancellor Michelle Rhee and carried on by current School Chancellor Kaya Henderson.

The first post was about a bogus “research” study in which National Bureau of Economic Research (NBER)/University of Virginia and Stanford researchers made overstated and false claims that the system was indeed working and effective, despite the fact that (among other problems) 83% of the teachers in the study did not have student test scores available to measure their “value-added.” The second post was about a DC teacher’s experiences being evaluated under this system (as part of the aforementioned 83%) using almost solely his administrator’s and master educator’s observational scores. That post demonstrated, with data, how error-prone this part of the DC system has also evidenced itself to be.

Adding to the value-added issues in DC, DC public school officials disclosed (the day before winter break), and two Washington Post articles then reported (see the first article here and the second here), that 44 DC public school teachers also received incorrect evaluation scores for the last academic year (2012-2013) because of technical errors in the ways the scores were calculated. One of the 44 teachers was fired as a result, although (s)he is now looking to be reinstated and compensated for the salary lost.

While “[s]chool officials described the errors as the most significant since the system launched a controversial initiative in 2009 to evaluate teachers in part on student test scores,” they also downplayed the situation as impacting only 44 teachers.

VAM formulas are certainly “subject to error,” and they are subject to error always, across the board, for teachers in general as well as for the 470 DC public school teachers whose value-added scores were based on student test scores. Put more accurately, just over 10% (n=470) of all DC teachers (n≈4,000) were evaluated using their students’ test scores, which is even smaller than the roughly 17% implied above (i.e., the complement of the 83% without test scores). And for about 10% of these teachers (n=44), calculation errors were found.
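For readers who want to verify these proportions, the back-of-the-envelope arithmetic (using the figures reported above, with a total of roughly 4,000 DC teachers) works out as follows:

```python
# Checking the rough arithmetic behind the percentages reported above.
total_teachers = 4000   # approximate total number of DC public school teachers
vam_teachers = 470      # teachers with individual value-added scores
miscalculated = 44      # teachers officially acknowledged to have erroneous scores

share_with_vam = vam_teachers / total_teachers    # 0.1175, i.e., just over 10%
share_with_errors = miscalculated / vam_teachers  # ~0.094, i.e., about 10%

print(f"Evaluated on student test scores: {share_with_vam:.1%}")
print(f"Of those, with calculation errors: {share_with_errors:.1%}")
```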

This is not a “minor glitch,” as written into a recent Huffington Post article covering the same story, which positions the teachers’ unions as almost irrational for “slamming the school system for the mistake and raising broader questions about the system.” It is a major glitch, caused both by inappropriate “weightings” of administrators’ and master educators’ observational scores and by “a small technical error” that directly impacted the teachers’ value-added calculations. It is a major glitch with major implications, about which others, including not just those from the unions but many (e.g., 90%) from the research community, are concerned. It is a major glitch that does warrant additional concern about this AND all of the other statistical and related errors not mentioned but prevalent in all value-added scores (e.g., the errors always found in large-scale standardized tests, particularly given their non-equivalent scales; the errors caused by missing data; the errors caused by small class sizes; the errors caused by summer learning loss/gains; the errors caused by other teachers’ simultaneous and carry-over effects; the errors caused by parental and peer effects [see also this recent post about these]; etc.).

So what type of consequence is in store for those perpetuating such nonsense? This includes, particularly here, those charged with calculating and releasing value-added “estimates” (“estimates” because these are not, and should never be interpreted as, hard data), but also the reporters who report on these issues without understanding them or reading the research about them. I, for one, would like to see them held accountable for the “value” they too are to “add” to our thinking about these social issues, when they instead detract and distract readers from the very real, research-based issues at hand.

VAMboozled! Post “Best of the Ed Blogs”…Plus

A recent VAMboozled! post titled “Unpacking DC’s Impact, or the Lack Thereof” was selected as one of the National Education Policy Center’s (NEPC’s) “Best of the Ed Blogs.” The “Best of the Ed Blogs” feature highlights interesting and insightful blog posts on education policy and today’s most important education topics, selected and showcased on the NEPC website.

The same post was highlighted in an article Diane Ravitch wrote for the Talking Points Memo titled “Did Michelle Rhee’s Policies In D.C. Work?” Another good read, capturing the major issues with not only DC’s teacher evaluation systems but also with national policy trends in education.



Unpacking DC’s Impact, or the Lack Thereof

Recently, I posted a critique of the newly released and highly publicized Mathematica Policy Research study about the (vastly overstated) “value” of value-added measures and their ability to effectively measure teacher quality. The study, which did not go through a peer-review process, is rife with methodological and conceptual problems, which I dismantled in the post with a consumer alert.

Yet again, VAM enthusiasts are attempting to VAMboozle policymakers and the general public with another faulty study, this time released to the media by the National Bureau of Economic Research (NBER). The “working paper” (i.e., not peer-reviewed, and in this case not even internally reviewed by those at NBER) analyzed the controversial teacher evaluation system (i.e., IMPACT) that was put into place in DC Public Schools (DCPS) under the then Chancellor, Michelle Rhee.

The authors, Thomas Dee and James Wyckoff (2013), present what they term “novel evidence” to suggest that the “uniquely high-powered incentives” linked to “teacher performance” worked to improve the “performance” of high-performing teachers, and that “dismissal threats” worked to increase the “voluntary attrition of low-performing teachers.” The authors, however, and similar to those of the Mathematica study, assert highly troublesome claims despite a plethora of problems; had this study undergone peer review before it was released to the public and hailed in the media, it would not have created the media hype that ensued. Hence, it is appropriate to issue yet another consumer alert.

The most serious problems include, but are not limited to, the following:

“Teacher Performance”: Probably the study’s most fatal flaw, or its most major limitation, was that only 17% of the teachers included in this study (i.e., teachers of reading and mathematics in grades 4 through 8) were actually evaluated under the IMPACT system for their “teacher performance,” or for that which they contributed to the system’s most valued indicator: student achievement. The other 83% of the teachers did not have student test scores available to determine whether they were indeed effective (or not) using individual value-added scores. It is implied throughout the paper, as well as in the media reports covering this study post-release, that “teacher performance” was what was investigated, when in fact, for four out of five DC teachers, “performance” was evaluated only in terms of what they were observed doing or self-reported doing. These teachers were instead evaluated on their “performance” using almost exclusively (except for the 5% school-level value-added indicator) the same subjective measures integral to many traditional evaluation systems, along with student achievement/growth on teacher-developed and administrator-approved classroom-based tests.

Score Manipulation and Inflation: Relatedly, a major study limitation was that the aforementioned indicators used to define and observe changes in “teacher performance” (for the 83% of DC teachers) were almost entirely highly subjective, highly manipulable, and highly volatile. The socially constructed indicators used throughout this study were undoubtedly subject to score bias via manipulation and artificial inflation, as teachers (and their evaluators) were able to influence their ratings. While evidence of this was provided in the study, the authors banally dismissed the possibility as “theoretically [not really] reasonable.” When using tests, and especially subjective indicators, to measure “teacher performance,” one must exercise caution to ensure that those being measured do not engage in manipulation and inflation techniques known to effectively increase the scores derived and valued, particularly within such high-stakes accountability systems. Again, for 83% of the teachers, the “teacher performance” indicators were almost entirely manipulable (with the exception of school-level value-added, weighted at 5%).

Unrestrained Bias: Relatedly, the authors set forth a series of assumptions throughout their study that would have permitted readers to correctly predict the study’s findings without reading it. This is highly problematic as well, and it would not have been permitted had the scientific community been involved. Researcher bias can certainly impact (or sway) study findings, and this most certainly happened here.

Other problems include gross overstatements (e.g., about how the IMPACT system has evidenced itself as financially sound and sustainable over time), dismissed yet highly complex technical issues (e.g., classification errors and the arbitrary thresholds the authors used to statistically define and examine whether teachers “jumped” thresholds and became more effective), and other over-simplistic treatments of major methodological and pragmatic issues (e.g., cheating in DC Public Schools and whether this impacted outcome “teacher performance” data), and the like.

To read the full critique of the NBER study, click here.

The claims the authors have asserted in this study are disconcerting, at best. I wouldn’t be as worried if I knew that this paper truly was in a “working” state and still had to undergo peer-review before being released to the public. Unfortunately, it’s too late for this, as NBER irresponsibly released the report without such concern. Now, we as the public are responsible for consuming this study with critical caution and advocating for our peers and politicians to do the same.

Teachers Accountable for Achievement Gains Now More than Ever

Forty-one states now require that students’ growth on large-scale standardized tests (e.g., via VAMs) be used for teacher evaluation and accountability purposes. Recent studies have demonstrated this marked increase over the last few years (see also a forthcoming study with my former doctoral student Clarin Collins in Teachers College Record), but this article in the Huffington Post provides a decent graphic illustrating where the nation currently stands in terms of these initiatives/policies.

[Graphic: teacher evaluations]

However, I do have to add a Consumer Alert! While the research evidence summarized in this article suggests that American public school teachers’ “average SAT scores have increased significantly over the last decade” [i.e., through 2008], it is both implicitly and explicitly suggested throughout this article that this finding somehow yields evidence that current teacher evaluation/accountability reforms in this area are working, and that “we should keep heading in [this] direction”?

Notably, the “National Council on Teacher Quality attributed the rapid pace of change to the Race to the Top, the federal government competition that had recession-addled states vie for money in exchange for implementing education reforms, such as teacher evaluations.” However, the current teacher evaluation/accountability reforms did not really start until 2008 at the earliest (and Race to the Top came later still), which is also the final year for which teachers’ SAT scores were examined here. These SAT gains have literally nothing to do with the teacher evaluation and accountability “reforms” currently underway!

It is remiss to covertly suggest, in any way, shape, or form, that one (i.e., increased SAT scores) evidences that the other (i.e., test-based accountability) works. This is not overly surprising, however, as the three groups associated with this piece include the uber-conservatives: Education Next, Michelle Rhee’s StudentsFirst, and the National Council on Teacher Quality. The simplistic agenda being advanced here, with data that cannot validly be used to advance it, should be both critically and carefully consumed, especially as unfounded statements ensue throughout the piece regardless.