More VAM Opposition from Weingarten

In a recent post, I wrote that Randi Weingarten, the current president of the American Federation of Teachers (AFT), has (finally) expressed her full opposition to the use of value-added models (VAMs) to evaluate and measure teacher effectiveness. She has since elaborated on her reasons in a recent Huffington Post article, which also addresses the Common Core and its relationship with VAMs.

She writes: “Just look at what’s happened with the over-reliance on tests and value-added methodology (VAM). VAM is an incomprehensible formula, at least to those who don’t have a Ph.D. in advanced statistics, which attempts to predict how a teacher’s students will score in the future by using past test scores and other various assumptions — and then compares that prediction to actual results. Like predicting the weather, VAM is subject to many factors that influence the final result. That VAM score is then used to sort, rank and evaluate teachers.

The AFT has always been leery about VAM — and we’ve said since day one that VAM should never be the singular measure of student learning used to evaluate teachers. In fact, I questioned the fairness, accuracy and reliability of value-added metrics in a 2007 New York Times column. We have enough evidence today to make it clear that not only has VAM not worked, it’s been really destructive and it’s emboldened those seeking to turn public education into a numbers game.

Pittsburgh teachers acted in good faith to partner with the district on an evaluation system that included VAM with multiple measures of student learning. But while the system was being designed, anti-public education legislation was passed in Pennsylvania that hijacked a promising professional growth system by making it a numbers game fixated on ranking, sorting and firing teachers.

In Florida, the system went completely haywire, giving teachers value-added scores for students they had never taught or who weren’t even in the same building. One example is Mrs. Cook, an elementary school teacher who was named teacher of the year by her colleagues but was labeled unsatisfactory based on a VAM score calculated using the performance of students she hadn’t taught.

In 2011, the average margin of error for VAM scores in New York City was plus or minus 28 points.

We have heard similar stories in Los Angeles, New Mexico, Houston and elsewhere. But what happened in Washington, D.C., was really the last straw. Last month, right before the holiday break, the district announced that some VAM scores were incorrect due to a technical glitch — a technical glitch that affected the lives and livelihoods of the educators who received these scores. As of today, 44 teachers have been told their scores from last year were wrong (including one teacher who was fired). And the district’s response was simply to say it was a minor issue. Would the district have the same reaction if it involved 44 students? When you use a system for such high stakes — a system that lacks transparency, accuracy and reliability on so many levels — how can you ever expect the teachers to trust the system?

I may have labeled VAM a sham, but many others built the evidence base for it.

The RAND Corp. and the Board on Testing and Assessment of the National Research Council of the National Academy of Sciences both conclude that VAM results shouldn’t be used to evaluate individual teachers.

It doesn’t have to be this way.”
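
To make the basic mechanics Weingarten describes above more concrete, here is a minimal sketch of the core value-added idea: predict each student's current score from prior scores, then average each teacher's students' prediction errors. The data, the single prior-score predictor, and the simple averaging are all hypothetical simplifications; operational VAMs add many more covariates, multiple years of data, and statistical adjustments such as shrinkage.

```python
# Minimal, hypothetical sketch of the basic value-added idea: predict current
# scores from prior scores, then average each teacher's students' residuals.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: prior- and current-year test scores for 200 students,
# each assigned to one of 10 teachers. All numbers are made up for illustration.
n_students, n_teachers = 200, 10
prior = rng.normal(500, 50, n_students)                      # prior-year scale scores
teacher = rng.integers(0, n_teachers, n_students)            # teacher assignments
current = 0.8 * prior + 100 + rng.normal(0, 25, n_students)  # current-year scale scores

# Step 1: predict current scores from prior scores with a simple linear fit.
slope, intercept = np.polyfit(prior, current, deg=1)
predicted = slope * prior + intercept

# Step 2: each student's residual is the actual score minus the predicted score.
residuals = current - predicted

# Step 3: a teacher's "value-added" estimate is the mean residual of his or her
# students, i.e., how far those students landed above or below prediction.
vam_scores = {t: residuals[teacher == t].mean() for t in range(n_teachers)}
for t, score in sorted(vam_scores.items()):
    print(f"Teacher {t}: value-added estimate = {score:+.1f} points")
```

Even in this toy version, the "score" a teacher receives is just an average residual from a statistical prediction, which is why sampling error and omitted factors loom so large in the critiques quoted above.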

Follow-Up to Previous Post

I want to bring attention to a revision I made to the previous post about the 44 teachers “misclassified” in DC. To be clear: while only 44 teachers were officially acknowledged as having received incorrect teacher evaluation scores, the actual number is unquestionably much higher, given that these formulas are always “subject to error,” and at times subject to gross errors, across the board. Regardless of what the official reports might reveal, it should be duly noted that it was not just these 44 who were “misclassified” due to this “minor glitch.”

Thanks to Bruce Baker, Professor at Rutgers and author of School Finance 101, for the collegial reminder to clarify this point.

One of the “Forty-Four” Misclassified DC Teachers Speaks Up and Out

Two weeks ago I wrote a post about what’s going on in DC’s public schools with their value-added-based teacher evaluation system, and more specifically about the 44 DC public school teachers who received “incorrect” VAM scores for the last academic year (2012-2013). While this occurred for more than just these 44 teachers (because VAM formulas are always “subject to error” across the board), per the official report just these 44 were misclassified because of a “simple” algorithmic error in the Mathematica Inc. (third-party) formula used to calculate DC teachers’ scores. One of the 44 teachers was fired as a result.

Another “One of the ‘Forty-Four’ Teachers” is now speaking up and speaking out about this situation, using as the basis for his reflections the email he received from the district, with, most pertinently (in my opinion), all of its arbitrariness on display. Check out the email he received, as well as “the district’s” explanations of both the errors and the system, and in particular its weighting schema. As you read, recall another previous VAMboozled! post in which the actual administrator and master educator scores (the scores at the source of the errors for this teacher) were shown to be wholly invalid as well.

Read this DC teacher’s other thoughts as well, as they too are pertinent and very personal. My favorite: “What is DCPS’ plan for re-instituting the one teacher who was ‘fired mistakenly?’ I may not speak legalese, but I’m sure there are legal ramifications for this ‘error.’ Side note, suggesting the teacher was ‘fired mistakenly’ is akin to saying someone was ‘robbed accidentally.'”

AFT’s Randi Weingarten (Finally) Speaking Out Against VAMs

As just posted on Diane Ravitch’s blog, Randi Weingarten, the current president of the American Federation of Teachers (AFT), has (finally) expressed her full opposition to using value-added models (VAMs), the statistical measures of utmost interest on this blog, for teacher evaluation, accountability, and merit pay purposes.

I echo Diane’s sentiments that this is indeed “great news!” and that Weingarten should be saluted for her courage and insight, particularly as she has abandoned her previously held position in light of the research evidence. She is now launching a campaign against VAMs and their (mis)uses.

As background, Weingarten wrote the foreword to the only academic book released on VAMs to date, Value-Added Measures in Education, written by Douglas Harris, now an Associate Professor of Economics at Tulane. Unfortunately, in that foreword she endorsed Harris’s overall (and, in my opinion, highly misguided and prematurely enthusiastic) stance on VAMs, writing, for example, that Harris “presents a convincing argument that value-added’s imprecision need not be a deal breaker as long as we understand where it comes from and how to account for it when these measures are used in schools. We cannot expect any measures of teacher quality – value-added or others – to be perfect.” In doing so, Weingarten co-signed Harris’s position that VAMs are “good enough” for their current uses, mainly on the grounds that they are better than the other test-based accountability options used in the past. For more about Harris’s book and his overall position, read the commentary I wrote in Teachers College Record in review of his book and his “good enough” stance.

As per a recent post on politico.com, Weingarten’s new mantra is that “VAM is a sham.” This is “a notable shift for the AFT and its affiliates, which have previously ratified contracts and endorsed evaluation systems that rely on VAM. Weingarten tells Morning Education that she has always been leery of value-added ‘but we rolled up our sleeves, acted in good faith and tried to make it work.’ Now, she says, she’s disillusioned.”

“What changed her mind? Weingarten points to a standoff in Pittsburgh over the implementation of a VAM-based evaluation system the union had endorsed. She says the algorithms and cut scores used to rate teachers were arbitrary. And she found the process corrosive: The VAM score was just a number that didn’t show teachers their strengths or weaknesses or suggest ways to improve. Weingarten said the final straw was the news that the contractor calculating VAM scores for D.C. teachers made a typo in the algorithm, resulting in 44 teachers receiving incorrect scores — including one who was unjustly fired for poor performance.”

“What’s next? The AFT’s newly militant stance against VAM will likely affect contract negotiations in local districts, and the union also plans to lobby the Education Department.”

A Consumer Alert Issued by The 21st Century Principal

In an excellent post just released by The 21st Century Principal, the author writes about yet two more companies calculating value-added for school districts, again on the taxpayer’s dime. The companies specifically named are Teacher Match and Hanover Research, targeted for marketing and selling a series of highly false assumptions about teaching and teachers, highly false claims about value-added (without empirical research in support), highly false assertions about how value-added estimates can be used for better teacher evaluation/accountability, and highly false sales pitches about what they, as value-added/research “experts,” can do to help with the complex statistics needed for all of the above.

The main points of the article, as I see them and in order of priority, follow:

  1. School districts are purchasing these “products” based entirely on the promises and related marketing efforts of these (and other) companies. Consumer alert: instead of accepting these companies’ sales pitches and promises that their “products” will do what they claim, districts should demand independent, peer-reviewed research proving that what is being sold actually works. If the companies can’t produce the studies, they should not earn the contracts!
  2. Doing all of this is just another expensive drain on already scarce educational resources. One district is paying Teacher Match over $30,000 per year for its services, as cited in the piece. Relatedly, the Houston Independent School District is paying SAS Inc. $500,000 per year for its EVAAS-based value-added calculations. These are not trivial expenditures, especially considering the other, research-based initiatives toward which these valuable resources could otherwise be directed.
  3. States (and the companies selling their value-added services) haven’t done the validation studies needed to prove that the value-added scores/estimates are valid. Again, the sales and marketing claims made by these companies are almost always devoid of supporting evidence.
  4. Doing all of this elevates standardized testing even higher in schools’ data-driven decision-making processes, even though such an elevation is not warranted or empirically supported (as mentioned).
  5. Relatedly, value-added calculations rely on inexpensive (aka “cheap”) large-scale tests, themselves of questionable validity, that are still not designed for the purposes for which they are being used (e.g., measuring growth over time cannot be done without tests built on equivalent, vertically linked scales, which virtually no current tests have; see the sketch just after this list).
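
To make the vertical-scale concern in point 5 concrete, here is a minimal sketch with entirely hypothetical scores. Subtracting a fall score from a spring score is only interpretable as “growth” if the two tests report on the same, vertically equated scale; when they don’t, analysts often fall back on standardizing each test separately, which captures only changes in relative standing, not absolute learning.

```python
# Hypothetical illustration of the vertical-scale problem: a raw "gain" of
# spring minus fall is only meaningful if both tests use the same, vertically
# equated scale. Here the two tests deliberately use different scales.

# Assumed example: a fall test scored 0-50 and a spring test scored 200-800
# (different scales, not vertically linked). All numbers are made up.
fall_raw = {"Ana": 38, "Ben": 25, "Cai": 44}
spring_raw = {"Ana": 540, "Ben": 610, "Cai": 500}

# Naive "growth" computed on raw scores mixes apples and oranges.
naive_gain = {s: spring_raw[s] - fall_raw[s] for s in fall_raw}
print("Naive gains (meaningless across different scales):", naive_gain)

# A common workaround is to standardize each test separately (z-scores), which
# puts both on a relative scale -- but that only shows movement within the
# group, not absolute growth in what was learned.
def z_scores(scores):
    values = list(scores.values())
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {s: (v - mean) / sd for s, v in scores.items()}

fall_z, spring_z = z_scores(fall_raw), z_scores(spring_raw)
relative_change = {s: round(spring_z[s] - fall_z[s], 2) for s in fall_raw}
print("Change in relative standing (z-score units):", relative_change)
```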

The shame in all of this, besides the major issues mentioned in the five points above, is that the federal government, thanks to US Secretary of Education Arne Duncan and the Obama administration, is incentivizing these and other companies (e.g., SAS with its EVAAS, Mathematica) to exist, to construct and sell such “products,” and then to seek out and compete for these publicly funded and subsidized contracts. We, as taxpayers, are the ones consistently footing the bills.

See also another recent article about the chaos that a simple error in Mathematica’s code caused in Washington DC’s public schools, which follows the VAMboozled! post about the same topic from two weeks ago.

Student Learning Objectives, aka Student Growth Objectives, aka Another Attempt to Quantify “High Quality” Teaching

After a previous post about VAMs v. Student Growth Percentiles (SGPs) (see also VAMs v. SGPs Part II), a reader posted a comment asking for more information about the utility of SGPs, but also about the difference between SGPs and Student Growth Objectives.

“Student Growth Objectives” is a new term for an older concept that is being increasingly integrated into educational accountability systems nationwide, and that is also coming under scrutiny (see one of Diane Ravitch’s recent posts about this here). The concept underlying Student Growth Objectives (SGOs) is essentially that of Student Learning Objectives (SLOs). Why “growth” is being used in place of “learning” is unclear; perhaps it is yet another fad. It also likely has something to do with various legislative requirements (e.g., Race to the Top terminologies), although evidence supporting this shift in terminology is likewise lacking.

Regardless, and put simply, an SGO/SLO is an annual goal for measuring the growth/learning of students instructed by teachers (or principals, for school-level evaluations) who are not eligible to participate in a school’s or district’s value-added or student growth model. This includes the vast majority of teachers in most schools and districts (e.g., 70+%), because only those teachers who instruct reading/language arts or mathematics in grade levels covered by state achievement tests, typically grades 3-8, are eligible to participate in the VAM or SGP evaluation system. SGOs/SLOs were developed, then, because administrators and others were either unwilling to let these exclusions continue or were forced to establish a mechanism for including these other teachers in order to meet some legislative mandate.

New Jersey, for example, defines an SGO as “a long-term academic goal that teachers set for groups of students and must be: Specific and measureable; Aligned to New Jersey’s curriculum standards; Based on available prior student learning data; A measure of what a student has learned between two points in time; Ambitious and achievable” (for more information click here).

Denver Public Schools has been using SGOs for many years; their 2008-2009 Teacher Handbook states that an SGO must be “focused on the expected growth of [a teacher’s] students in areas identified in collaboration with their principal,” as well as that the objectives must be “Job-based; Measurable; Focused on student growth in learning; Based on learning content and teaching strategies; Discussed collaboratively at least three times during the school year; May be adjusted during the school year; Are not directly related to the teacher evaluation process; [and] Recorded online” (for more information click here).
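
As a concrete (and entirely hypothetical) illustration of the “two points in time” idea in the definitions above, here is a minimal sketch of how a simple SGO/SLO might be scored: the teacher sets a target for how many students should gain a given number of points between a fall and a spring assessment, and the objective is judged met or not against the observed gains. The scores, the 10-point gain, and the 70% threshold below are made-up numbers; actual SGO rubrics and scoring tiers vary widely by district.

```python
# Hypothetical sketch of scoring a simple SGO/SLO: did enough students gain
# enough points between two assessments? All numbers are assumptions.
fall_scores   = [42, 55, 61, 48, 70, 39, 66, 52, 58, 45]   # hypothetical fall (pre) scores
spring_scores = [57, 63, 70, 60, 78, 50, 71, 66, 72, 49]   # hypothetical spring (post) scores

target_gain = 10     # assumed minimum per-student gain the teacher commits to
target_share = 0.70  # assumed share of students who must reach that gain

gains = [post - pre for pre, post in zip(fall_scores, spring_scores)]
share_meeting = sum(g >= target_gain for g in gains) / len(gains)

print(f"Share of students gaining {target_gain}+ points: {share_meeting:.0%}")
print("SGO met" if share_meeting >= target_share else "SGO not met")
```

Even this toy version makes the core issue visible: the judgment about the teacher hinges entirely on where the gain and share thresholds are set, which is exactly the kind of arbitrariness critics raise.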

That being said, and in sum, SGOs/SLOs, like VAMs, are not supported by empirical work. As Jersey Jazzman summarized very well in his post about this, the correlational evidence is very weak, the conclusions drawn by outside researchers are a stretch, and the rush to implement these measures is just as unfounded as the rush to implement VAMs for educator evaluation. We don’t know that SGOs/SLOs make a difference in distinguishing “good” from “poor” teachers; in fact, some could argue (as Jersey Jazzman does) that they don’t actually do much of anything at all. They’re just another metric being used in the attempt to quantify “high quality” teaching.

Thanks to Dr. Sarah Polasky for this post.