A New Book about VAMs “On Trial”

I recently heard about a new book written by Mark Paige (J.D. and Ph.D., assistant professor of public policy at the University of Massachusetts-Dartmouth, and a former school law attorney) and published by Rowman & Littlefield. As the second part of its title indicates, the book is about “Understanding Value-Added Models [VAMs] in the Law of Teacher Evaluation.” For those of you who might be interested in reading more, see more on this book, including information about how to purchase it, here, and also via Amazon here.

Clearly, this book should prove very relevant given the ongoing court cases across the country (see a prior post on these cases here) regarding teachers and the systems being used to evaluate them, especially when those systems rely heavily on VAM-based estimates for consequential decision-making purposes (e.g., teacher tenure, pay, and termination). While I have not yet read the book, I ordered my copy the other day. I suggest you do the same, should you be interested in better understanding the federal and state law pertinent to these cases.

Notwithstanding, I also requested that the author of this book — Mark Paige — write a guest post so that you too could find out more. Here is what he wrote:

Many of us have been following VAMs in legal circles. Several courts have faced the issue of VAMs as they relate to employment law matters. These cases have tested a chief selling point (pardon [or underscore] the business reference) of VAMs: that they will effectuate, for example, teacher termination with greater ease because nobody besides advanced statisticians and econometricians can argue with the numbers they derive. In other words, if a teacher’s VAM rating is bad, then the teacher must be bad. It is supposed to be as simple as that. How can a court deny that reality?

Of course, as we [should] already know, VAMs are anything but certain. Bluntly stated: VAMs are a statistical “hot mess.” The American Statistical Association, among many others, warned in no uncertain terms that VAMs cannot – and should not – be trusted to make significant employment decisions. Of course, that has not stopped many policymakers from a full-throated adoption of their use in employment and evaluation decisions. Talk about hubris.

Accordingly, I recently completed this book, which focuses squarely on the intersection of VAMs and the law. Its full title is “Building a Better Teacher: Understanding Value-Added Models in the Law of Teacher Evaluation” (Rowman & Littlefield, 2016). Again, I provide a direct link to the book along with its description here.

To offer a bit of a sneak preview, though, I draw many conclusions throughout the book, but one of two important take-aways is this: VAMs may actually complicate the effectuation of a teacher’s termination. Here’s one way: because VAMs are so statistically infirm, they invite plaintiff-side attorneys to attack any underlying negative decision based on these models. See, for example, the recent New York State Supreme Court decision in Sheri Lederman’s case, here. [See also a related post in this blog here.]

In other words, the evidence upon which districts or states rely to make significant decisions is untrustworthy (or arbitrary) and, therefore, so is any decision based, even in part, on VAMs. Thus, VAMs may actually strengthen a teacher’s case. This, of course, is quite apart from the fact that VAM use results in firing good teachers based on poor information, thereby contributing to the teacher shortages and lower morale (among many other parades of horribles) being reported across the nation, now likely more than ever.

The second important take-away is this, especially given followers of this blog include many educators and administrators facing a barrage of criticisms that only “de-professionalize” them: Courts have, over time, consistently deferred to the professional judgment of administrators (and their assessment of effective teaching). The members of that august institution – the judiciary – actually believe that educators know best about teaching, and that years of accumulated experience and knowledge have actual and also court-relevant value. That may come as a startling revelation to those who consistently diminish the education profession, or those who at least feel like they and their efforts are consistently being diminished.

To be sure, the system of educator evaluation is not perfect. Our schools continue to struggle to offer equal and equitable educational opportunities to all students, especially those in the nation’s highest needs schools. But what this book ultimately concludes is that the continued use of VAMs will not, hu-hum, add any value to these efforts.

To reach author Mark Paige via email, please contact him at mpaige@umassd.edu. To reach him via Twitter: @mpaigelaw

No More EVAAS for Houston: School Board Tie Vote Means Non-Renewal

Recall from prior posts (here, here, and here) that seven teachers in the Houston Independent School District (HISD), with the support of the Houston Federation of Teachers (HFT), are taking HISD to federal court over how their value-added scores, derived via the Education Value-Added Assessment System (EVAAS), are being used, and allegedly abused, in this district, which has tied more high-stakes consequences to value-added output than any other district/state in the nation. The case, Houston Federation of Teachers, et al. v. Houston ISD, is ongoing.

But just announced is that the HISD school board, in a 3:3 split vote late last Thursday night, elected to no longer pay an annual $680K to SAS Institute Inc. to calculate the district’s EVAAS value-added estimates. As per an HFT press release (below), HISD “will not be renewing the district’s seriously flawed teacher evaluation system, [which is] good news for students, teachers and the community, [although] the school board and incoming superintendent must work with educators and others to choose a more effective system.”


Apparently, HISD was holding onto the EVAAS, despite the research surrounding the EVAAS in general and in Houston, because the district has received (and is still set to receive) over $4 million in federal grant funds that have required it to have value-added estimates as a component of its evaluation and accountability system(s).

While this means that the federal government is still largely in favor of the use of value-added models (VAMs) in terms of its funding priorities, despite its prior authorization of the Every Student Succeeds Act (ESSA) (see here and here), this also means that HISD might have to find another growth model or VAM to still comply with the feds.

Regardless, during the Thursday night meeting a board member noted that HISD has been kicking this EVAAS can down the road for 5 years. “If not now, then when?” the board member asked. “I remember talking about this last year, and the year before. We all agree that it needs to be changed, but we just keep doing the same thing.” A member of the community said to the board: “VAM hasn’t moved the needle [see a related post about this here]. It hasn’t done what you need it to do. But it has been very expensive to this district.” He then listed the other things on which HISD could spend (and could have spent) its annual $680K EVAAS estimate costs.

Soon thereafter, the HISD school board called for a vote, and it ended up being a 3-3 tie. Because of the 3-3 tie vote, the school board rejected the effort to continue with the EVAAS. What this means for the related and aforementioned lawsuit is still indeterminate at this point.

“Arbitrary and Capricious:” Sheri Lederman Wins Lawsuit in NY’s State Supreme Court

Recall the New York lawsuit pertaining to Long Island teacher Sheri Lederman? She just won in New York’s State Supreme Court, and boy did she win big, also for the cause!

Sheri is, by all accounts other than her 2013-2014 “ineffective” growth score of 1/20, a terrific 4th grade teacher and an 18-year veteran. However, after receiving her “ineffective” growth rating and score, she, along with her attorney and husband Bruce Lederman, sued the state of New York to challenge the state’s growth-based teacher evaluation system and Sheri’s individual score. See prior posts about Sheri’s case here, here, here, and here.

The more specific goal of her case was to seek a judgment: (1) setting aside or vacating Sheri’s individual growth score and her rating as “ineffective,” and (2) declaring that the New York endorsed and implemented growth measures in use were/are “arbitrary and capricious.” The “overall gist” was Sheri’s contention that the system unfairly penalized teachers whose students consistently scored well and so could not demonstrate upward growth (e.g., teachers of gifted or other high-achieving students). This concern/complaint is common elsewhere.

As per a State Supreme Court ruling, just released today as written by Acting Supreme Court Justice Roger McDonough (May 10, 2016), 15 pages in length and available in full here, Sheri won her case. She won it against John King, the then New York State Education Department Commissioner and now US Secretary of Education (who recently replaced Arne Duncan in that role). The Court concluded that Sheri (with her husband, her team of experts, and other witnesses) effectively established that her growth score and rating for 2013-2014 were “arbitrary and capricious,” with “arbitrary and capricious” being defined as actions “taken without sound basis in reason or regard to the facts.”

More specifically, the Court’s conclusion was founded upon: “(1) the convincing and detailed evidence of VAM bias against teachers at both ends of the spectrum (e.g. those with high-performing students or those with low-performing students); (2) the disproportionate effect of petitioner’s small class size and relatively large percentage of high-performing students; (3) the functional inability of high-performing students to demonstrate growth akin to lower-performing students; (4) the wholly unexplained swing in petitioner’s growth score from 14 [i.e., her growth score the year prior] to 1, despite the presence of statistically similar scoring students in her respective classes; and, most tellingly, (5) the strict imposition of rating constraints in the form of a “bell curve” that places teachers in four categories via pre-determined percentages regardless of whether the performance of students dramatically rose or dramatically fell from the previous year.”

As per an email I received earlier today from Bruce (i.e., Sheri’s husband/attorney who prosecuted her case), the Court otherwise “declined to make an overall ruling on the [New York growth] rating system in general because of new regulations in effect” [e.g., that the state’s growth model is currently under review]…“[Nonetheless, t]he decision should qualify as persuasive authority for other teachers challenging growth scores throughout the County [and Country]. [In addition, the] Court carefully recite[d] all our expert affidavits [i.e., from Professors Darling-Hammond, Pallas, Amrein-Beardsley, Sean Corcoran and Jesse Rothstein as well as Drs. Burris and Lindell].” Noted as well were the “absence of any meaningful” challenge to [Sheri’s] experts’ conclusions, especially about the dramatic swings noticed between her, and potentially others’, scores, and the other “litany of expert affidavits submitted on [Sheri’s] behalf.”

“It is clear that the evidence all of these amazing experts presented was a key factor in winning this case since the Judge repeatedly said both in Court and in the decision that we have a ‘high burden’ to meet in this case.” [In addition,] [t]he Court wrote that it “does not lightly enter into a critical analysis of this matter … [and] is constrained on this record, to conclude that [the] petitioner [i.e., Sheri] has met her high burden.”

To Bruce’s/our knowledge, this is the first time a judge has set aside an individual teacher’s VAM rating based upon such a presentation in court.

Thanks to all who helped in this endeavor. Onward!

Virginia SGP’s Side of the Story

In one of my most recent posts I wrote about how Virginia SGP, aka parent Brian Davison, won in court against the state of Virginia, requiring them to release teachers’ Student Growth Percentile (SGP) scores. Virginia SGP is a very vocal promoter of the use of SGPs to evaluate teachers’ value-added (although many do not consider the SGP model to be a value-added model (VAM); see general differences between VAMs and SGPs here). Regardless, he sued the state of Virginia to release teachers’ SGP scores so he could make them available to all via the Internet. He did this, more specifically, so parents and perhaps others throughout the state would be able to access and then potentially use the scores to make choices about who should and should not teach their kids. See other posts about this story here and here.

Those of us who are familiar with Virginia SGP and the research literature writ large know that, unfortunately, there’s much that Virginia SGP does not understand about the now loads of research surrounding VAMs as defined more broadly (see multiple research article links here). Likewise, Virginia SGP, as evidenced below, rests most of his research-based arguments on select sections of a small handful of research studies (e.g., those written by economists Raj Chetty and colleagues, and Thomas Kane as part of Kane’s Measures of Effective Teaching (MET) studies) that do not represent the general research on the topic. He simultaneously ignores/rejects the research studies that empirically challenge his research-based claims (e.g., that there is no bias in VAM-based estimates, and that because Chetty, Friedman, and Rockoff “proved this,” it must be true, despite the research studies that have presented evidence otherwise (see for example here, here, and here)).

Nonetheless, given that his winning this case in Virginia is still noteworthy, and that followers of this blog should be aware of this particular case, I invited Virginia SGP to write a guest post so that he could tell his side of the story. As we have exchanged emails in the past, which I must add have become less abrasive/inflamed as time has passed, I recommend that readers read and also critically consume what is written below. Let’s hope that we might have some healthy and honest dialogue on this particular topic in the end.

From Virginia SGP:

I’d like to thank Dr. Amrein-Beardsley for giving me this forum.

My school district recently announced its teacher of the year. John Tuck teaches in a school with 70%+ FRL students compared to a district average of ~15% (don’t ask me why we can’t even those #’s out). He graduated from an ordinary school with a degree in liberal arts. He only has a Bachelors and is not a National Board Certified Teacher (NBCT). He is in his ninth year of teaching specializing in math and science for 5th graders. Despite the ordinary background, Tuck gets amazing student growth. He mentors, serves as principal in the summer, and leads the school’s leadership committees. In Dallas, TX, he could have risen to the top of the salary scale already, but in Loudoun County, VA, he only makes $55K compared to a top salary of $100K for Step 30 teachers. Tuck is not rewarded for his talent or efforts largely because Loudoun eschews all VAMs and merit-based promotion.

This is largely why I enlisted the assistance of Arizona State law school graduate Lin Edrington in seeking the Virginia Department of Education’s (VDOE) VAM (SGP) data via a Freedom of Information Act (FOIA) suit (see pertinent files here).

VAMs are not perfect. There are concerns about validity when switching from paper to computer tests. There are serious concerns about reliability when VAMs are computed with small sample sizes or are based on classes not taught by the rated teacher (as appeared to occur in New Mexico, Florida, and possibly New York). Improper uses of VAMs give reformers a bad name. This was not the case in Virginia. SGPs were only to be used when appropriate with 2+ years of data and 40+ scores recommended.

I am a big proponent of VAMs based on my reviews of the research. We have the Chetty/Friedman/Rockoff (CFR) studies, of course, including their recent paper showing virtually no bias (Table 6). The following briefing presented by Professor Friedman at our trial gives a good layman’s overview of their high level findings. When teachers are transferred to a completely new school but their VAMs remain consistent, that is very convincing to me. I understand some point to the cautionary statement of the ASA suggesting districts apply VAMs carefully and explicitly state their limitations. But the ASA definitely recommends VAMs for analyzing larger samples including schools or district policies, and CFR believe their statement failed to consider updated research.

To me, the MET studies provided some of the most convincing evidence. Not only are high VAMs on state standardized tests correlated to higher achievement on more open-ended short-answer and essay-based tests of critical thinking, but students of high-VAM teachers are more likely to enjoy class (Table 14). This points to VAMs measuring inspiration, classroom discipline, the ability to communicate concepts, subject matter knowledge and much more. If a teacher engages a disinterested student, their low scores will certainly rise along with their VAMs. CFR and others have shown this higher achievement carries over into future grades and success later in life. VAMs don’t just measure the ability to identify test distractors, but the ability of teachers to inspire.

So why exactly did the Richmond City Circuit Court force the release of Virginia’s SGPs? VDOE applied for and received a No Child Left Behind (NCLB) waiver like many other states. But in court testimony provided in December of 2014, VDOE acknowledged that districts were not complying with the waiver by not providing the SGP data to teachers or using SGPs in teacher evaluations despite “assurances” to the US Department of Education (USDOE). When we initially received a favorable verdict in January of 2015, instead of trying to comply with NCLB waiver requirements, my district of Loudoun County Public Schools (LCPS) laughed. LCPS refused to implement SGPs or even discuss them.

There was no dispute that the largest Virginia districts had committed fraud when I discussed these facts with the US Attorney’s office and lawyers from the USDOE in January of 2016, but the USDOE refused to support a False Claims Act suit. And while nearly every district stridently refused to use VAMs [i.e., SGPs], the Virginia Secretary of Education was falsely claiming in high profile op-eds that Virginia was using “progress and growth” in the evaluation of schools. Yet, VDOE never used the very measure (SGPs) that the ESEA [i.e., NCLB] waivers required to measure student growth. The irony is that if these districts had used SGPs for just 1% of their teachers’ evaluations after the December of 2014 hearing, their teachers’ SGPs would be confidential today. I could only find one county that utilized SGPs, and their teachers’ SGPs are exempt. Sometimes fraud doesn’t pay.

My overall goals are threefold:

  1. Hire more Science Technology Engineering and Mathematics (STEM) majors to get kids excited about STEM careers and effectively teach STEM concepts
  2. Use growth data to evaluate policies, administrators, and teachers. Share the insights from the best teachers and provide professional development to ineffective ones
  3. Publish private sector equivalent pay so young people know how much teachers really earn (pensions often add 15-18% to their salaries). We can then recruit more STEM teachers and better overall teaching candidates

What have this lawsuit and activism cost me? A lot. I ate $5K of the cost of the VDOE SGP suit even after the award[ing] of fees. One local school board member has banned me from commenting on his “public figure” Facebook page (which I see as a free speech violation), because I questioned both his denial of SGPs and some other conflicts of interest I saw, although indirectly related to this particular case. The judge in the case even sanctioned me $7K just for daring to hold him accountable. And after criticizing LCPS for violating the Family Educational Rights and Privacy Act (FERPA) by coercing kids who fail Virginia’s Standards of Learning tests (SOLs) to retake them, I was banned from my kids’ school for being a “safety threat.”

Note that I am a former Naval submarine officer and have held Department of Defense (DOD) clearances for 20+ years. I attended a meeting this past Thursday with LCPS officials in which they [since] acknowledged I was no safety threat. I served in the military, and along with many I have fought for the right to free speech.

Accordingly, I am no shrinking violet. Despite having LCPS attorneys sanction perjury, the Republican Commonwealth Attorney refused to prosecute and then illegally censored me in public forums. So the CA will soon have to sign a consent order acknowledging violating my constitutional rights (he effectively admitted as much already). And a federal civil rights complaint against the schools for their retaliatory ban is being drafted as we speak. All of this resulted from my efforts to have public data released and hold LCPS officials accountable to state and federal laws. I have promised that the majority of any potential financial award will be used to fund other whistle blower cases, [against] both teachers and reformers. I have a clean background and administrators still targeted me. Imagine what they would do to someone who isn’t willing to bear these costs!

In the end, I encourage everyone to speak out based on your beliefs. Support your case with facts not anecdotes or hastily conceived opinions. And there are certainly efforts we can all support like those of Dr. Darling-Hammond. We can hold an honest debate, but please remember that schools don’t exist to employ teachers/principals. Schools exist to effectively educate students.

“Virginia SGP” Wins in Court Against State

Virginia SGP, also known as Brian Davison — a parent of two public school students in the affluent Loudoun, Virginia area (hereafter referred to as Virginia SGP) — has been an avid (and sometimes abrasive) commentator about value-added models (VAMs), defined generically, on this blog (see, for example, here, here, and here), on Diane Ravitch’s blog (see, for example, here, here, and here), and elsewhere (e.g., Virginia SGP’s Facebook page here). He is an advocate and promoter of the use of VAMs (which are in this particular case Student Growth Percentiles (SGPs); see differences between VAMs and SGPs here and here) to evaluate teachers, and he is an advocate and promoter of the release of teachers’ SGP scores to parents and the general public for their consumption and use.

Related, and as described in a Washington Post article published in March of 2016, Virginia SGP “…Pushed [Virginia] into Debate of Teacher Privacy vs. Transparency for Parents” as per teachers’ SGP data. This occurred via a lawsuit Virginia SGP filed against the state, attempting to force the release of SGP data for all teachers across the state. More specifically, and akin to what happened in 2010 when the Los Angeles Times published the names and VAM-based ratings of thousands of teachers teaching in the Los Angeles Unified School District (LAUSD), Virginia SGP “pressed for the data’s release because he thinks parents have a right to know how their children’s teachers are performing, information about public employees that exists but has so far been hidden. He also wants to expose what he says is Virginia’s broken promise to begin using the data to evaluate how effective the state’s teachers are.” He thinks that “teacher data should be out there,” especially if taxpayers are paying for it.

In January, a Richmond, Virginia judge ruled in Virginia SGP’s favor, despite the state’s claims that Virginia school districts, despite the state’s investments, had reportedly not been using the SGP data, “calling them flawed and unreliable measures of a teacher’s effectiveness.” And even though this ruling was challenged by state officials and the Virginia Education Association thereafter, Virginia SGP posted via his Facebook page the millions of student records the state released in compliance with the court, with teacher names and other information redacted.

This past Tuesday, however, and despite the challenges to the court’s initial ruling, came another win for Virginia SGP, as well as another loss for the state of Virginia. See the article “Judge Sides with Loudoun Parent Seeking Teachers’ Names, Student Test Scores,” published yesterday in a local Loudoun, Virginia news outlet.

The author of this article, Danielle Nadler, explains more specifically that, “A Richmond Circuit Court judge has ruled that [the] VDOE [Virginia Department of Education] must release Loudoun County Public Schools’ Student Growth Percentile [SGP] scores by school and by teacher…[including] teacher identifying information.” The judge noted “that VDOE and the Loudoun school system failed to ‘meet the burden of proof to establish an exemption’ under Virginia’s Freedom of Information Act [FOIA].” The court also ordered VDOE to pay Davison $35,000 to cover his attorney fees and other costs. This final order was dated April 12, 2016.

“Davison said he plans to publish the information on his ‘Virginia SGP’ Facebook page. Students will not be identified, but some of the teachers will. ‘I may mask the names of the worst performers when posting rankings/lists but other members of the public can analyze the data themselves to discover who those teachers are,’” Virginia SGP said.

I’ve exchanged messages with Virginia SGP both prior to this ruling and since, and I’ve explicitly invited him to also comment via this blog. While I disagree with this objective and the subsequent ruling (although I do believe in transparency), the case is nonetheless newsworthy in the realm of VAMs and for followers/readers of this blog. Comment now and/or do stay tuned for more.

The “Vergara v. California” Decision Reversed: Another (Huge) Victory in Court

In June of 2014, defendants in “Vergara v. California” in Los Angeles, California lost their case. As a reminder, plaintiffs included nine public school students (backed by some serious corporate reformer funds as per Students Matter) who challenged five California state statutes that supported the state’s “ironclad [teacher] tenure system.” The plaintiffs’ argument was that students’ rights to a good education were being violated by teachers’ job protections…protections that were making it too difficult to fire “grossly ineffective” teachers. The plaintiffs’ suggested replacement for the “old” way of doing this, of course, was to use value-added scores to make “better” decisions about which teachers to fire and whom to keep around.

In February of 2016, “Vergara v. California” was appealed, back in Los Angeles.

Released yesterday was the Court of Appeal’s decision reversing the trial court’s earlier decision. As per an email I received, also yesterday, from one of the lawyers involved, “The unanimous decision holds that the plaintiffs did not establish their equal protection claim because they did not show that the challenged [“ironclad” tenure] laws themselves cause harm to poor students or students of color.” Accordingly, the Court of Appeal “ordered that judgment be entered for the defendants (the state officials and teachers’ unions)…[and]…this should end the case, and copycat cases in other parts of the country [emphasis added].” However, plaintiffs have already announced their intent to appeal this ruling to the California Supreme Court.

Please find attached here, as certified for publication, the actual Court of Appeal decision. See also a post here about this reversal authored by California teachers’ unions. See also here more information released by the California Teachers Association.

See also the amicus brief that a large set of deans and professors across the country contributed to/signed to help in this reversal.

Victory in New Mexico’s Lawsuit, Again

My most recent post about the state of New Mexico (here) included an explanation of a New Mexico Judge’s ruling to postpone New Mexico’s state-wide teacher evaluation trial until October 2016, with the state’s December 2015 preliminary injunction (described here) in place until (at least) then.

New Mexico’s Public Education Department (PED) recently, however, also tried to appeal the Judge’s December 2015 injunction, and took it to New Mexico’s Court of Appeals for an emergency review of the Judge’s injunction order.

The state and its PED lost, again. Here is the court order, which essentially says that the appeal was denied, and pasted below is the press release, released by the American Federation of Teachers New Mexico and Albuquerque Teachers Federation (i.e., the plaintiffs in this case).

Also here is an article just released in the Santa Fe New Mexican about this ruling, also about how the “Appeals court reject[ed the state’s] request to intervene in [this] teacher evaluation case.”


Court Denies Request from Public Education Department; Keeps Case in District Court

March 16, 2016

Contact: John Dyrcz

Albuquerque – American Federation of Teachers New Mexico (AFT NM) President Stephanie Ly and Albuquerque Teachers Federation (ATF) President Ellen Bernstein released the following statement:

“We are not surprised by today’s decision of the New Mexico Court of Appeals denying the New Mexico Public Education Department’s request for an interlocutory – or emergency – review of District Court Judge David Thomson’s injunction order. The December 2015 injunction preventing the PED from using its faulty evaluation system to penalize educators was well reasoned and the product of a fair and lengthy series of hearings over four months.

“We have maintained throughout this process that while the PED has every right to pursue all legal options under our judicial system, these frequent attempts at disrupting the progress of this case are nothing more than an attempt to stall the momentum of our efforts to seek relief for New Mexico’s education community.

“With this order, the case returns to Judge Thomson for final testimony from our expert witnesses, and we are pleased that the temporary injunction granted in December of 2015 will remain in place until at least October of 2016, when AFT NM and ATF will seek to make the injunction permanent,” said Ly and Bernstein.

Alleged Violation of Protective Order in Houston Lawsuit, Overruled

Many of you will recall a post I made public in January including “Houston Lawsuit Update[s], with Summar[ies] of Expert Witnesses’ Findings about the EVAAS” (Education Value-Added Assessment System sponsored by SAS Institute Inc.). What you might not have recognized since, however, was that I pulled the post down a few weeks after I posted it. Here’s the back story.

In January 2016, the Houston Federation of Teachers (HFT) published an “EVAAS Litigation Update,” which summarized a portion of Dr. Jesse Rothstein’s expert report in which he conclude[d], among other things, that teachers do not have the ability to meaningfully verify their EVAAS scores. He wrote that “[a]t most, a teacher could request information about which students were assigned to her, and could read literature — mostly released by SAS, and not the product of an independent investigation — regarding the properties of EVAAS estimates.” On January 10, 2016, I published the post “Houston Lawsuit Update, with Summary of Expert Witnesses’ Findings about the EVAAS,” summarizing what I considered to be the twelve key highlights of HFT’s “EVAAS Litigation Update,” in which I highlighted Rothstein’s above conclusions.

Lawyers representing SAS Institute Inc. charged that this post, along with the more detailed “EVAAS Litigation Update” I summarized within the post (authored by the Houston Federation of Teachers (HFT) to keep their members in Houston up-to-date on the progress of this lawsuit) violated a protective order that was put in place to protect SAS’s EVAAS computer source code. Even though there is/was nothing in the “EVAAS Litigation Update” or the blog post that disclosed the source code, SAS objected to both as disclosing conclusions that, SAS said, could not have been reached in the absence of a review of the source code. They threatened HFT, its lawyers, and its experts (myself and Dr. Rothstein) with monetary sanctions. HFT went to court in order to get the court’s interpretation of the protective order and to see if a Judge agreed with SAS’s position. In the meantime, I removed the prior post (which is now back up here).

The great news is that the Judge found in HFT’s favor. He found that neither the “EVAAS Litigation Update” nor the related blog post violated the protective order. Further, he found that “we” have the right to share other updates on the Houston lawsuit, which is still pending, as long as the updates do not violate the protective order still in place. This includes discussion of the conclusions or findings of experts, provided that the source code is not disclosed, either explicitly or by necessary implication.

In more specific terms, as per his ruling in his Court Order, the judge ruled that SAS Institute Inc.’s lawyers “interpret[ed] the protective order too broadly in this instance. Rothstein’s opinion regarding the inability to verify or replicate a teacher’s EVAAS score essentially mimics the allegations of HFT’s complaint. The Litigation Update made clear that Rothstein confirmed this opinion after review of the source code; but it [was] not an opinion ‘that could not have been made in the absence of [his] review’ of the source code. Rothstein [also] testified by affidavit that his opinion is not based on anything he saw in the source code, but on the extremely restrictive access permitted by SAS.” He added that “the overly broad interpretation urged by SAS would inhibit legitimate discussion about the lawsuit, among both the union’s membership and the public at large.” That, also in his words, would be an “unfortunate result” that should, in the future, be avoided.

Here, again, are the 12 key highlights of the EVAAS Litigation Update:
  • Large-scale standardized tests have never been validated for their current uses. In other words, as per my affidavit, “VAM-based information is based upon large-scale achievement tests that have been developed to assess levels of student achievement, but not levels of growth in student achievement over time, and not levels of growth in student achievement over time that can be attributed back to students’ teachers, to capture the teachers’ [purportedly] causal effects on growth in student achievement over time.”
  • The EVAAS produces different results from another VAM. When, for this case, Rothstein constructed and ran an alternative, albeit sophisticated VAM using data from HISD both times, he found that results “yielded quite different rankings and scores.” This should not happen if these models are indeed yielding indicators of truth, or true levels of teacher effectiveness from which valid interpretations and assertions can be made.
  • EVAAS scores are highly volatile from one year to the next. Rothstein, when running the actual data, found that while “[a]ll VAMs are volatile…EVAAS growth indexes and effectiveness categorizations are particularly volatile due to the EVAAS model’s failure to adequately account for unaccounted-for variation in classroom achievement.” In addition, volatility is “particularly high in grades 3 and 4, where students have relatively few[er] prior [test] scores available at the time at which the EVAAS scores are first computed.”
  • EVAAS overstates the precision of teachers’ estimated impacts on growth. As per Rothstein, “This leads EVAAS to too often indicate that teachers are statistically distinguishable from the average…when a correct calculation would indicate that these teachers are not statistically distinguishable from the average.”
  • Teachers of English Language Learners (ELLs) and “highly mobile” students are substantially less likely to demonstrate added value, as per the EVAAS, and likely most/all other VAMs. This, what we term as “bias,” makes it “impossible to know whether this is because ELL teachers [and teachers of highly mobile students] are, in fact, less effective than non-ELL teachers [and teachers of less mobile students] in HISD, or whether it is because the EVAAS VAM is biased against ELL [and these other] teachers.”
  • The number of students each teacher teaches (i.e., class size) also biases teachers’ value-added scores. As per Rothstein, “teachers with few linked students—either because they teach small classes or because many of the students in their classes cannot be used for EVAAS calculations—are overwhelmingly [emphasis added] likely to be assigned to the middle effectiveness category under EVAAS (labeled “no detectable difference [from average], and average effectiveness”) than are teachers with more linked students.”
  • Ceiling effects are certainly an issue. Rothstein found that in some grades and subjects, “teachers whose students have unusually high prior year scores are very unlikely to earn high EVAAS scores, suggesting that ‘ceiling effects’ in the tests are certainly relevant factors.” While EVAAS and HISD have previously acknowledged such problems with ceiling effects, they apparently believe these effects are being mediated with the new and improved tests recently adopted throughout the state of Texas. Rothstein, however, found that these effects persist even given the new and improved tests.
  • There are major validity issues with “artificial conflation.” This is a term I recently coined to represent what is happening in Houston, and elsewhere (e.g., Tennessee), when district leaders (e.g., superintendents) mandate or force principals and other teacher effectiveness appraisers or evaluators, for example, to align their observational ratings of teachers’ effectiveness with value-added scores, with the latter being the “objective measure” around which all else should revolve, or align; hence, the conflation of the one to match the other, even if entirely invalid. As per my affidavit, “[t]o purposefully and systematically endorse the engineering and distortion of the perceptible ‘subjective’ indicator, using the perceptibly ‘objective’ indicator as a keystone of truth and consequence, is more than arbitrary, capricious, and remiss…not to mention in violation of the educational measurement field’s Standards for Educational and Psychological Testing” (American Educational Research Association (AERA), American Psychological Association (APA), National Council on Measurement in Education (NCME), 2014).
  • Teaching-to-the-test is of perpetual concern. Both Rothstein and I, independently, noted concerns about how “VAM ratings reward teachers who teach to the end-of-year test [more than] equally effective teachers who focus their efforts on other forms of learning that may be more important.”
  • HISD is not adequately monitoring the EVAAS system. According to HISD, EVAAS modelers keep the details of their model secret, even from them and even though they are paying an estimated $500K per year for district teachers’ EVAAS estimates. “During litigation, HISD has admitted that it has not performed or paid any contractor to perform any type of verification, analysis, or audit of the EVAAS scores. This violates the technical standards for use of VAM that AERA specifies, which provide that if a school district like HISD is going to use VAM, it is responsible for ‘conducting the ongoing evaluation of both intended and unintended consequences’ and that ‘monitoring should be of sufficient scope and extent to provide evidence to document the technical quality of the VAM application and the validity of its use’ (AERA Statement, 2015).”
  • EVAAS lacks transparency. AERA emphasizes the importance of transparency with respect to VAM uses. For example, as per the AERA Council who wrote the aforementioned AERA Statement, “when performance levels are established for the purpose of evaluative decisions, the methods used, as well as the classification accuracy, should be documented and reported” (AERA Statement, 2015). However, and in contrast to meeting AERA’s requirements for transparency, in this district and elsewhere, as per my affidavit, the “EVAAS is still more popularly recognized as the ‘black box’ value-added system.”
  • Related, teachers lack opportunities to verify their own scores. This part is really interesting. “As part of this litigation, and under a very strict protective order that was negotiated over many months with SAS [i.e., SAS Institute Inc. which markets and delivers its EVAAS system], Dr. Rothstein was allowed to view SAS’ computer program code on a laptop computer in the SAS lawyer’s office in San Francisco, something that certainly no HISD teacher has ever been allowed to do. Even with the access provided to Dr. Rothstein, and even with his expertise and knowledge of value-added modeling, [however] he was still not able to reproduce the EVAAS calculations so that they could be verified.” Dr. Rothstein added, “[t]he complexity and interdependency of EVAAS also presents a barrier to understanding how a teacher’s data translated into her EVAAS score. Each teacher’s EVAAS calculation depends not only on her students, but also on all other students within HISD (and, in some grades and years, on all other students in the state), and is computed using a complex series of programs that are the proprietary business secrets of SAS Incorporated. As part of my efforts to assess the validity of EVAAS as a measure of teacher effectiveness, I attempted to reproduce EVAAS calculations. I was unable to reproduce EVAAS, however, as the information provided by HISD about the EVAAS model was far from sufficient.”
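
For readers who want to see the arithmetic behind the class-size highlight above, here is a minimal sketch. It is not the EVAAS model, which is proprietary and which Rothstein could not reproduce. It assumes a generic rule in which a teacher's average student gain is divided by its standard error and labeled "no detectable difference" unless that ratio reaches 2 in either direction. The function name, the cutoff of 2, the student-level standard deviation of 10 points, and the 3-point average gain are all hypothetical values chosen for illustration.

```python
import math


def classify(mean_gain, student_sd, n_students, cutoff=2.0):
    """Label a teacher under a generic (hypothetical) significance rule.

    mean_gain   average gain of the teacher's linked students, in test points
    student_sd  assumed spread of individual student gains
    n_students  number of students linked to the teacher
    cutoff      assumed threshold on the gain / standard-error ratio
    """
    # The standard error shrinks with the square root of the number of students.
    standard_error = student_sd / math.sqrt(n_students)
    index = mean_gain / standard_error
    if index >= cutoff:
        return "above average"
    if index <= -cutoff:
        return "below average"
    return "no detectable difference"


# Same average gain, same spread; only the number of linked students differs.
small_class = classify(mean_gain=3.0, student_sd=10.0, n_students=8)
large_class = classify(mean_gain=3.0, student_sd=10.0, n_students=60)

print("8 linked students: ", small_class)
print("60 linked students:", large_class)
```

Under these assumed numbers, the teacher with 8 linked students falls in the middle category while the teacher with 60 linked students, showing the identical average gain, does not. That is the general pattern the class-size highlight describes: with few linked students, the label says more about the sample size than about the teacher.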

New Mexico’s Teacher Evaluation Trial Postponed Until October, w/Preliminary Injunction Still in Place

Last December in New Mexico, a Judge granted a preliminary injunction preventing consequences from being attached to the state’s teacher evaluation data as based on the state’s value-added model (VAM). More specifically, Judge David K. Thomson ruled that the state can proceed with “developing” and “improving” its teacher evaluation system, but the state is not to make any consequential decisions about New Mexico’s teachers using the data the state collects until the state (and/or others external to the state) can evidence to the court during another trial (which was set for April of 2016) that the system is reliable, valid, fair, uniform, and the like. See more details regarding Judge Thomson’s ruling in a previous post here: “Consequences Attached to VAMs Suspended Throughout New Mexico.” See more details about this specific lawsuit, sponsored by the American Federation of Teachers (AFT) New Mexico and the Albuquerque Teachers Federation (ATF), in a previous post here: “Lawsuit in New Mexico Challenging [the] State’s Teacher Evaluation System.” This is one of the cases on which I am continuing to serve as an expert witness.

Yesterday, however, and given another state-level lawsuit that is also ongoing regarding the state’s teacher evaluation system, although this one is sponsored by the National Education Association (NEA), Judge Thomson (apparently along with Judge Francis Mathew) pushed both the AFT-NM/ATF and NEA trials back to October of 2016, yielding a six month delay for the AFT-NM/ATF hearing.

According to an article published this morning in the Santa Fe New Mexican, “To date, the [New Mexico] Public Education Department [PED] has been unsuccessful in its efforts to stop either suit or combine them;” hence, yesterday in court the state requested that the court postpone both hearings so that the state could introduce its new teacher evaluation system, on March 15 of 2016, along with its specifics and rules, as also based on the state’s new Partnership for the Assessment of Readiness for College and Careers (PARCC) test data. Recall that the state’s Secretary of Education – Hanna “Skandera is new chair of PARCC test board.” It is also anticipated, however, that the state’s new system is to still “rely heavily” (i.e., 50% weight) on VAMs. See also a related post about “New Mexico Chang[ing] its Teacher Evaluation System, But Not Really.”

This window of time is also to allow for the public forums needed to review the state’s new system, but also to allow time for “the acrimony to be resolved without trials.” The preliminary injunction granted by Judge Thomson in December, though, still remains in place. See also a related article, also published this morning, in the Albuquerque Journal.

Stephanie Ly, president of the AFT-NM, said she is not happy with the trial being postponed. She called this a “stalling tactic” to give the [state] education department more time to compile student achievement data that the plaintiffs have been requesting. “We had no option but to agree because they are withholding data,” she said.

Ly and ATF President Ellen Bernstein also responded yesterday via a joint statement, pasted in full below:

March 7, 2016

Contact: John Dyrcz — 505-554-8679

“The Public Education Department and Secretary Skandera have once again willfully delayed the AFT NM/ATF lawsuit against the current value added model [VAM] evaluation system due to their purposeful refusal to reveal the data being used to evaluate our educators in New Mexico.

“In addition to this stall tactic, and during a status hearing this morning in the First District Court, lawyers for the PED revealed that new rules and regulations were to be unveiled on March 15 by the PED, and would ‘rely heavily’ on VAM as a method of evaluation for educators.

“New Mexico educators will not cease in our fight against the abusive policies of this administration. Allowing PED or districts to terminate employees based on VAM and student test scores is completely unacceptable, it is unacceptable to allow PED or districts to refuse licensure advancement based upon VAM scores, and it is unacceptable for PED or districts to place New Mexico educators on growth plans based on faulty data.

“High-performing education systems have policies in place which respect and support their educators and use evaluations not as punitive measures but as opportunities for improvement. Educators, unions, and administrators should oversee the evaluation process to ensure it is thorough and of high quality, as well as fair and reliable. Educators, unions, and administrators should be involved in developing, implementing and monitoring the system to ensure it reflects good teaching well, that it operates effectively, that it is tied to useful learning opportunities for teachers, and that it produces valid results.

“It is well known the PED is in a current state of crisis with several high-level staff members abandoning the Department, an on-going whistle-blower lawsuit…the failure to produce meaningful changes to education in New Mexico during her six years as Secretary, and Skandera’s constant changes to the rules is a desperate attempt to right a sinking ship,” said Ly and Bernstein.

Vergara v. California Appeal Underway: The Case that Will Yield No Winners

In June of 2014, defendants in “Vergara v. California” in Los Angeles, California lost the case. Plaintiffs included nine public school students (backed by some serious corporate reformer funds as per Students Matter) who challenged five California state statutes that supported the state’s “ironclad [teacher] tenure system.” The plaintiffs’ argument was that students’ rights to a good education were being violated by teachers’ job protections…protections that were making it too difficult to fire “grossly ineffective” teachers. The plaintiffs’ suggested replacement to the “old” way of doing this, of course, was to use value-added scores to make “better” decisions about which teachers to fire and whom to keep around, as based on teachers’ causal impacts on students’ “data.”

This week, this case is being appealed, back in Los Angeles (see a recent Education Week article on the appeal here; see also the Students Matter website for daily appeal updates here). This, accordingly, is a very important case to watch, especially as many agree that this case will eventually end up in no less than the state’s Supreme Court.

On this note, though, I came across a great article, also in Education Week, this morning, capturing as per the article’s title, the “Five Reasons Vergara Is Still Unwinnable.” I already tweeted this one out, but for those of you not following us on Twitter, I didn’t want you to miss this one.

The author — Charles Taylor Kerchner, Research Professor at Claremont Graduate University — puts the key pieces of the case in context as well as under a fair and appropriate light, more specifically explaining why “this is a case that the plaintiffs can’t win and the defendants will lose regardless of the outcome.” This, in other words and as per his opinion, is a case that will ultimately yield no winners.

Do read Kerchner’s full Education Week piece here, and share out as you see fit. I’ve also copied/pasted the text below (e.g., for those of you who follow via email).


As the trial court arguments concluded in the spring of 2014, one of the first ‘On California’ posts argued that, “from our perspective this is a case that the plaintiffs can’t win and the defendants will lose regardless of the outcome.”  It still is.

Oral arguments on its appeal began last week, a decision is due in 90 days, and an appeal to the state Supreme Court is considered a near certainty.  Just in case you haven’t been listening to the well-oiled noise machine surrounding the case, EdWeek’s Stephen Sawchuk provides a backgrounder.

Teacher Labor Market Realities

First of all, the plaintiffs can’t win this case because they don’t understand—or willfully ignore—the realities of the teacher labor market.  The underlying problem in the supply and demand for teachers is not that young very good teachers were being fired while old sluggish ones held on to their jobs.  As the recent data on teacher shortages shows, the problem is attracting good people to teaching in the first place and holding onto them.  Most young teachers who teach in challenging schools leave because the work is too hard, not because they were laid off. 

If the plaintiffs really want to increase the quality of the teacher work force, then they should put their money behind efforts to forgive student loans or provide residency programs for novice teachers so that they are not dissuaded by the shock of stepping into a classroom without a solid grounding in the practicalities of teaching.

Value Added Testing

Second, accepting Vergara equates to accepting value added testing as a valid means of assessing teacher performance.  Value added testing began as an attempt to substitute achievement gains for the more socially biased “league table” ranking of schools.  Its early advocates used the technique to demonstrate the influence that a good teacher has on a student’s long-term academic progress and economic life chances.  The economists that argued for the Vergara plaintiffs made much of this reasoning.

Unfortunately, value added systems are usually terrible when they are put in place. The “value” in value-added are nearly always scores on state standardized tests.  Some of these tests are not very good indicators.  For example, nearly all the state tests used by Vergara plaintiffs have been replaced by measures more aligned with the Common Core of state standards.

Most of the tests are only given in a few grades in a few subjects.  Teachers in other grades and subjects get a composite score based on how well the whole school or an entire grade performed, a score that has little to do with that teacher’s value added.

It’s nonsense to use such gross statistical artifacts as the means to dismiss a teacher, or to reward one.  (A Tennessee case featured a teacher who was denied a bonus because his value added scores didn’t make the cut.  He taught largely advanced students, who were not required to take the state tests, and thus his entire value added score rested on one class.)

Disparate Impact

Third, the case accepts the constitutional principle of “disparate impact.”  This evidentiary argument has its origins in housing discrimination cases where it has been held that a law or practice, such as a bank’s lending policy, need not be discriminatory on its face if its impact was unfairly felt. 

If one accepts that people of color are generally discriminated against, and that poor people of color are absolutely discriminated against, then any rule or regulation within the education system is vulnerable to a disparate impact challenge.  Any form of teacher tenure?  Licenses to teach?  A pension system that encourages older teachers to stay instead of making way for young, enthusiastic ones?  School district boundaries?  Civil service protections?  Because all these exist in an inherently discriminatory environment, they would all be vulnerable if Vergara were upheld.

Rich People and Simplistic Solutions

Fourth, Vergara points rich people toward simplistic solutions.  Venture philanthropy is built around the assumption that people with wealth can use their money to disrupt institutions rather than support existing ones.  Students Matter, which is bankrolling the Vergara lawsuit, is a good example. 

It tinkers with three relatively inconsequential aspects of teacher quality while ignoring the much more fundamental changes in teaching and learning that need to take place in order to create a 21st Century education system.

At least as a thought experiment, people with money ought to be required to specify where they are headed.  If public monopoly, which every high performing school system in the world uses to deliver education, is bad, then specify the alternative.  Hiding behind empty phrases such as “grossly incompetent teachers,” derived from a statistical analysis of state test scores, is no substitute for the hard intellectual work of designing a novel education system.

I’m with the so-called reformers in the belief that the education system put in place more than a century ago needs transformation, but certainly those who want to change it should be required to come up with something better than increasing the amount of time it takes to get tenure by 12 months.

Buying Bullets for Your Opponents

Fifth, Vergara has created yet another instance in which the California Teachers Association and the California Federation of Teachers can inflict damage on themselves.  I hope they prevail in this appeal.  They should.  But in winning, they lose.  They will continue to be a target of opportunity by Republicans and an object of scorn among school reformers. 

They have utterly failed to seize the opportunity for policy leadership presented by the lawsuit and the unprecedented but transitory political support they currently enjoy in Sacramento.

Rather than build on strength, a siege mentality has overtaken union leaders, as in “they’re all around us.”  If that’s the case, you’d think that the unions would quit supplying their opponents with ammunition.

I hope the appellate justices overturn Vergara, but regardless, the case will yield no winners.