The New York Times on “The Little Known Statistician” Who Passed

As many of you may recall, I wrote a post last March about the passing of William L. Sanders at age 74. Sanders developed the Education Value-Added Assessment System (EVAAS) — the value-added model (VAM) on which I have conducted most of my research (see, for example, here and here) and the VAM at the core of most of the teacher evaluation lawsuits in which I have been (or still am) engaged (see here, here, and here).

Over the weekend, though, The New York Times released a similar piece about Sanders’s passing, titled “The Little-Known Statistician Who Taught Us to Measure Teachers.” Because I had multiple colleagues and blog followers email me (or email me about) this article, I thought I would share it out with all of you, with some additional comments, of course, but also given the comments I already made in my prior post here.

First, I will start by saying that the title of this article is misleading in that what this “little-known” statistician contributed to the field of education was hardly “little” in terms of its size and impact. Rather, Sanders and his associates at SAS Institute Inc. greatly influenced our nation in terms of the last decade of our nation’s educational policies, as largely bent on high-stakes teacher accountability for educational reform. This occurred in large part due to Sanders’s (and others’) lobbying efforts when the federal government ultimately choose to incentivize and de facto require that all states hold their teachers accountable for their value-added, or lack thereof, while attaching high-stakes consequences (e.g., teacher termination) to teachers’ value-added estimates. This, of course, was to ensure educational reform. This occurred at the federal level, as we all likely know, primarily via Race to the Top and the No Child Left Behind Waivers essentially forced upon states when states had to adopt VAMs (or growth models) to also reform their teachers, and subsequently their schools, in order to continue to receive the federal funds upon which all states still rely.

It should be noted, though, that we as a nation have been relying upon similar high-stakes educational policies since the late 1970s (i.e., for now over 35 years); however, we have literally no research evidence that these high-stakes accountability policies have yielded any of their intended effects, as still perpetually conceptualized (see, for example, Nevada’s recent legislative ruling here) and as still advanced via large- and small-scale educational policies (e.g., we are still A Nation At Risk in terms of our global competitiveness). Yet, we continue to rely on the logic in support of such “carrot and stick” educational policies, even with this last decade’s teacher- versus student-level “spin.” We as a nation could really not be more ahistorical in terms of our educational policies in this regard.

Regardless, Sanders contributed to all of this at the federal level (that also trickled down to the state level) while also actively selling his VAM to state governments as well as local school districts (i.e., including the Houston Independent School District in which teacher plaintiffs just won a recent court ruling against the Sanders value-added system here), and Sanders did this using sets of (seriously) false marketing claims (e.g., purchasing and using the EVAAS will help “clear [a] path to achieving the US goal of leading the world in college completion by the year 2020”). To see two empirical articles about the claims made to sell Sanders’s EVAAS system, the research non-existent in support of each of the claims, and the realities of those at the receiving ends of this system (i.e., teachers) as per their experiences with each of the claims, see here and here.

Hence, to assert that what this “little known” statistician contributed to education was trivial or inconsequential is entirely false. Thankfully, with the passage of the Every Student Succeeds Act” (ESSA) the federal government came around, in at least some ways. While not yet acknowledging how holding teachers accountable for their students’ test scores, while ideal, simply does not work (see the “Top Ten” reasons why this does not work here), at least the federal government has given back to the states the authority to devise, hopefully, some more research-informed educational policies in these regards (I know….).

Nonetheless, may he rest in peace (see also here), perhaps also knowing that his forever stance of “[making] no apologies for the fact that his methods were too complex for most of the teachers whose jobs depended on them to understand,” just landed his EVAAS in serious jeopardy in court in Houston (see here) given this stance was just ruled as contributing to the violation of teachers’ Fourteenth Amendment rights (i.e., no state or in this case organization shall deprive any person of life, liberty, or property, without due process [emphasis added]).

Breaking News: Another Big Victory in Court in Texas

Earlier today I released a post regarding “A Big Victory in Court in Houston,” in which I wrote about how, yesterday, US Magistrate Judge Smith ruled — in Houston Federation of Teachers et al. v. Houston Independent School District — that Houston teacher plaintiffs’ have legitimate claims regarding how their Education Value-Added Assessment System (EVAAS) value-added scores, as used (and abused) in HISD, was a violation of their Fourteenth Amendment due process protections (i.e., no state or in this case organization shall deprive any person of life, liberty, or property, without due process). Hence, on this charge, this case is officially going to trial.

Well, also yesterday, “we” won another court case on which I also served as an expert witness (I served as an expert witness on behalf of the plaintiffs alongside Jesse Rothstein in the court case noted above). As per this case — Texas State Teachers Association v. Texas Education Agency, Mike Morath in his Official Capacity as Commissioner of Education for the State of Texas (although there were three similar cases also filed – see all four referenced below) — The Honorable Lora J. Livingston ruled that the Defendants are to make revisions to 19 Tex. Admin. Code § 150.1001 that most notably include the removal of (A) student learning objectives [SLOs], (B) student portfolios, (C) pre and post test results on district level assessments; or (D) value added data based on student state assessment results. In addition, “The rules do not restrict additional factors a school district may consider…,” and “Under the local appraisal system, there [will be] no required weighting for each measure…,” although districts can chose to weight whatever measures they might choose. “Districts can also adopt an appraisal system that does not provide a single, overall summative rating.” That is, increased local control.

If the Texas Education Agency (TEA) does not adopt the regulations put forth by the court by next October, this case will continue. This does not look likely, however, in that as per a news article released today, here, Texas “Commissioner of Education Mike Morath…agreed to revise the [states’] rules in exchange for the four [below] teacher groups’ suspending their legal challenges.” As noted prior, the terms of this settlement call for the removal of the above-mentioned, state-required, four growth measures when evaluating teachers.

This was also highlighted in a news article, released yesterday, here, with this one more generally about how teachers throughout Texas will no longer be evaluated using their students’ test scores, again, as required by the state.

At the crux of this case, as also highlighted in this particular piece, and to which I testified (quite extensively), was that the value-added measures formerly required/suggested by the state did not constitute teachers’ “observable,” job-related behaviors. See also a prior post about this case here.

*****

Cases Contributing to this Ruling:

1. Texas State Teachers Association v. Texas Education Agency, Mike Morath, in his Official Capacity as Commissioner of Education for the State of Texas; in the 345th Judicial District Court, Travis County, Texas

2. Texas Classroom Teachers Association v. Mike Morath, Texas Commissioner of Education; in the 419th Judicial District Court, Travis County, Texas

3. Texas American Federation of Teachers v. Mike Morath, Commissioner of Education, in his official capacity, and Texas Education Agency; in the 201st Judicial District Court, Travis County, Texas

4. Association of Texas Professional Educators v. Mike Morath, the Commissioner of Education and the Texas Education Agency; in the 200th District Court of Travis County, Texas.

Breaking News: A Big Victory in Court in Houston

Recall from multiple prior posts (see here, here, here, and here) that a set of teachers in the Houston Independent School District (HISD), with the support of the Houston Federation of Teachers (HFT) and the American Federation of Teachers (AFT), took their district to federal court to fight against the (mis)use of their value-added scores, derived via the Education Value-Added Assessment System (EVAAS) — the “original” value-added model (VAM) developed in Tennessee by William L. Sanders who just recently passed away (see here). Teachers’ EVAAS scores, in short, were being used to evaluate teachers in Houston in more consequential ways than anywhere else in the nation (e.g., the termination of 221 teachers in just one year as based, primarily, on their EVAAS scores).

The case — Houston Federation of Teachers et al. v. Houston ISD — was filed in 2014 and just yesterday, United States Magistrate Judge Stephen Wm. Smith denied in the United States District Court, Southern District of Texas, the district’s request for summary judgment given the plaintiffs’ due process claims. Put differently, Judge Smith ruled that the plaintiffs’ did have legitimate claims regarding how EVAAS use in HISD was a violation of their Fourteenth Amendment due process protections (i.e., no state or in this case organization shall deprive any person of life, liberty, or property, without due process). Hence, on this charge, this case is officially going to trial.

This is a huge victory, and one unprecedented that will likely set precedent, trial pending, for others, and more specifically other teachers.

Of primary issue will be the following (as taken from Judge Smith’s Summary Judgment released yesterday): “Plaintiffs [will continue to] challenge the use of EVAAS under various aspects of the Fourteenth Amendment, including: (1) procedural due process, due to lack of sufficient information to meaningfully challenge terminations based on low EVAAS scores,” and given “due process is designed to foster government decision-making that is both fair and accurate.”

Related, and of most importance, as also taken directly from Judge Smith’s Summary, he wrote:

  • HISD’s value-added appraisal system poses a realistic threat to deprive plaintiffs of constitutionally protected property interests in employment.
  • HISD does not itself calculate the EVAAS score for any of its teachers. Instead, that task is delegated to its third party vendor, SAS. The scores are generated by complex algorithms, employing “sophisticated software and many layers of calculations.” SAS treats these algorithms and software as trade secrets, refusing to divulge them to either HISD or the teachers themselves. HISD has admitted that it does not itself verify or audit the EVAAS scores received from SAS, nor does it engage any contractor to do so. HISD further concedes that any effort by teachers to replicate their own scores, with the limited information available to them, will necessarily fail. This has been confirmed by plaintiffs’ expert, who was unable to replicate the scores despite being given far greater access to the underlying computer codes than is available to an individual teacher [emphasis added, as also related to a prior post about how SAS claimed that plaintiffs violated SAS’s protective order (protecting its trade secrets), that the court overruled, see here].
  • The EVAAS score might be erroneously calculated for any number of reasons, ranging from data-entry mistakes to glitches in the computer code itself. Algorithms are human creations, and subject to error like any other human endeavor. HISD has acknowledged that mistakes can occur in calculating a teacher’s EVAAS score; moreover, even when a mistake is found in a particular teacher’s score, it will not be promptly corrected. As HISD candidly explained in response to a frequently asked question, “Why can’t my value-added analysis be recalculated?”:
    • Once completed, any re-analysis can only occur at the system level. What this means is that if we change information for one teacher, we would have to re- run the analysis for the entire district, which has two effects: one, this would be very costly for the district, as the analysis itself would have to be paid for again; and two, this re-analysis has the potential to change all other teachers’ reports.
  • The remarkable thing about this passage is not simply that cost considerations trump accuracy in teacher evaluations, troubling as that might be. Of greater concern is the house-of-cards fragility of the EVAAS system, where the wrong score of a single teacher could alter the scores of every other teacher in the district. This interconnectivity means that the accuracy of one score hinges upon the accuracy of all. Thus, without access to data supporting all teacher scores, any teacher facing discharge for a low value-added score will necessarily be unable to verify that her own score is error-free.
  • HISD’s own discovery responses and witnesses concede that an HISD teacher is unable to verify or replicate his EVAAS score based on the limited information provided by HISD.
  • According to the unrebutted testimony of plaintiffs’ expert, without access to SAS’s proprietary information – the value-added equations, computer source codes, decision rules, and assumptions – EVAAS scores will remain a mysterious “black box,” impervious to challenge.
  • While conceding that a teacher’s EVAAS score cannot be independently verified, HISD argues that the Constitution does not require the ability to replicate EVAAS scores “down to the last decimal point.” But EVAAS scores are calculated to the second decimal place, so an error as small as one hundredth of a point could spell the difference between a positive or negative EVAAS effectiveness rating, with serious consequences for the affected teacher.

Hence, “When a public agency adopts a policy of making high stakes employment decisions based on secret algorithms incompatible with minimum due process, the proper remedy is to overturn the policy.”

Moreover, he wrote, that all of this is part of the violation of teaches’ Fourteenth Amendment rights. Hence, he also wrote, “On this summary judgment record, HISD teachers have no meaningful way to ensure correct calculation of their EVAAS scores, and as a result are unfairly subject to mistaken deprivation of constitutionally protected property interests in their jobs.”

Otherwise, Judge Smith granted summary judgment to the district on the other claims forwarded by the plaintiffs, including plaintiffs’ equal protection claims. All of us involved in the case — recall that Jesse Rothstein and I served as the expert witnesses on behalf of the plaintiffs, and Thomas Kane of the Measures of Effective Teaching (MET) Project and John Friedman of the infamous Chetty et al. studies (see here and here) served as the expert witnesses on behalf of the defendants — knew that all of the plaintiffs’ claims would be tough to win given all of the constitutional legal standards would be difficult for plaintiffs to satisfy (e.g., that evaluating teachers using their value-added scores was not “unreasonable” was difficult to prove, as it was in the Tennessee case we also fought and was then dismissed on similar grounds (see here)).

Nonetheless, that “we” survived on the due process claim is fantastic, especially as this is the first case like this of which we are aware across the country.

Here is the press release, released last night by the AFT:

May 4, 2017 – AFT, Houston Federation of Teachers Hail Court Ruling on Flawed Evaluation System

Statements by American Federation of Teachers President Randi Weingarten and Houston Federation of Teachers President Zeph Capo on U.S. District Court decision on Houston’s Evaluation Value-Added Assessment System (EVAAS), known elsewhere as VAM or value-added measures:

AFT President Randi Weingarten: “Houston developed an incomprehensible, unfair and secret algorithm to evaluate teachers that had no rational meaning. This is the algebraic formula: = + (Σ∗≤Σ∗∗ × ∗∗∗∗=1)+

“U.S. Magistrate Judge Stephen Smith saw that it was seriously flawed and posed a threat to teachers’ employment rights; he rejected it. This is a huge victory for Houston teachers, their students and educators’ deeply held contention that VAM is a sham.

“The judge said teachers had no way to ensure that EVAAS was correctly calculating their performance score, nor was there a way to promptly correct a mistake. Judge Smith added that the proper remedy is to overturn the policy; we wholeheartedly agree. Teaching must be about helping kids develop the skills and knowledge they need to be prepared for college, career and life—not be about focusing on test scores for punitive purposes.”

HFT President Zeph Capo: “With this decision, Houston should wipe clean the record of every teacher who was negatively evaluated. From here on, teacher evaluation systems should be developed with educators to ensure that they are fair, transparent and help inform instruction, not be used as a punitive tool.”

New Texas Lawsuit: VAM-Based Estimates as Indicators of Teachers’ “Observable” Behaviors

Last week I spent a few days in Austin, one day during which I provided expert testimony for a new state-level lawsuit that has the potential to impact teachers throughout Texas. The lawsuit — Texas State Teachers Association (TSTA) v. Texas Education Agency (TEA), Mike Morath in his Official Capacity as Commissioner of Education for the State of Texas.

The key issue is that, as per the state’s Texas Education Code (Sec. § 21.351, see here) regarding teachers’ “Recommended Appraisal Process and Performance Criteria,” The Commissioner of Education must adopt “a recommended teacher appraisal process and criteria on which to appraise the performance of teachers. The criteria must be based on observable, job-related behavior, including: (1) teachers’ implementation of discipline management procedures; and (2) the performance of teachers’ students.” As for the latter, the State/TEA/Commissioner defined, as per its Texas Administrative Code (T.A.C., Chapter 15, Sub-Chapter AA, §150.1001, see here), that teacher-level value-added measures should be treated as one of the four measures of “(2) the performance of teachers’ students;” that is, one of the four measures recognized by the State/TEA/Commissioner as an “observable” indicator of a teacher’s “job-related” performance.

While currently no district throughout the State of Texas is required to use a value-added component to assess and evaluate its teachers, as noted, the value-added component is listed as one of four measures from which districts must choose at least one. All options listed in the category of “observable” indicators include: (A) student learning objectives (SLOs); (B) student portfolios; (C) pre- and post-test results on district-level assessments; and (D) value-added data based on student state assessment results.

Related, the state has not recommended or required that any district, if the value-added option is selected, to choose any particular value-added model (VAM) or calculation approach. Nor has it recommended or required that any district adopt any consequences as attached to these output; however, things like teacher contract renewal and sharing teachers’ prior appraisals with other districts in which teachers might be applying for new jobs is not discouraged. Again, though, the main issue here (and the key points to which I testified) was that the value-added component is listed as an “observable” and “job-related” teacher effectiveness indicator as per the state’s administrative code.

Accordingly, my (5 hour) testimony was primarily (albeit among many other things including the “job-related” part) about how teacher-level value-added data do not yield anything that is observable in terms of teachers’ effects. Likewise, officially referring to these data in this way is entirely false, in fact, in that:

  • “We” cannot directly observe a teacher “adding” (or detracting) value (e.g., with our own eyes, like supervisors can when they conduct observations of teachers in practice);
  • Using students’ test scores to measure student growth upwards (or downwards) and over time, as is very common practice using the (very often instructionally insensitive) state-level tests required by No Child Left Behind (NCLB), and doing this once per year in mathematics and reading/language arts (that includes prior and other current teachers’ effects, summer learning gains and decay, etc.), is not valid practice. That is, doing this has not been validated by the scholarly/testing community; and
  • Worse and less valid is to thereafter aggregate this student-level growth to the teacher level and then call whatever “growth” (or the lack thereof) is because of something the teacher (and really only the teacher did), as directly “observable.” These data are far from assessing a teacher’s causal or “observable” impacts on his/her students’ learning and achievement over time. See, for example, the prior statement released about value-added data use in this regard by the American Statistical Association (ASA) here. In this statement it is written that: “Research on VAMs has been fairly consistent that aspects of educational effectiveness that are measurable and within teacher control represent a small part of the total variation [emphasis added to note that this is variation explained which = correlational versus causal research] in student test scores or growth; most estimates in the literature attribute between 1% and 14% of the total variability [emphasis added] to teachers. This is not saying that teachers have little effect on students, but that variation among teachers [emphasis added] accounts for a small part of the variation [emphasis added] in [said test] scores. The majority of the variation in [said] test scores is [inversely, 86%-99% related] to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.”

If any of you have anything to add to this, please do so in the comments section of this post. Otherwise, I will keep you posted on how this goes. My current understanding is that this one will be headed to court.

New Mexico Lawsuit Update

The ongoing lawsuit in New Mexico has, once again (see here and here), been postponed to October of 2017 due to what are largely (and pretty much only) state (i.e., Public Education Department (PED)) delays. Whether the delays are deliberate are uncertain but being involved in this case… The (still) good news is that the preliminary injunction granted to teachers last fall (see here) still holds so that teachers cannot (or are not to) face consequences as based on the state’s teacher evaluation system.

For more information, this is the email the American Federation of Teachers – New Mexico (AFT NM) and the Albuquerque Teachers Federation (ATF) sent out to all of their members yesterday:

Yesterday, both AFT NM/ATF and PED returned to court to address the ongoing legal battle against the PED evaluation system. Our lawyer proposed that we set a court date ASAP. The PED requested a date for next fall citing their busy schedule as the reason. As a result, the court date is now late October 2017.

While we are relieved to have a final court date set, we are dismayed at the amount of time that our teachers have to wait for the final ruling.

In a statement to the press, ATF President Ellen Bernstein reflected on the current state of our teachers in regards to the evaluation system, “Even though they know they can’t be harmed in their jobs right now, it bothers them in the core of their being, and nothing I can say can take that away…It’s a cloud over everybody.”

AFT NM President Stephanie Ly, said, “It is a shame our educators still don’t have a legitimate evaluation system. The PED’s previous abusive evaluation system was thankfully halted through an injunction by the New Mexico courts, and the PED has yet to create an evaluation system that uniformly and fairly evaluates educators, and have shown no signs to remedy this situation. The PED’s actions are beyond the pale, and it is simply unjust.”

While we await trial, we want to thank our members who sent in their evaluation data to help our case. Remind your colleagues that they may still advance in licensure by completing a dossier; the PED’s arbitrary rating system cannot negatively affect a teacher’s ability to advance thanks to the injunction won by AFT NM/ATF last fall. That injunction will stay in place until a ruling is issued on our case next October.

In Solidarity,

Stephanie Ly

New Mexico Lawsuit Update

As you all likely recall, the American Federation of Teachers (AFT), joined by the Albuquerque Teachers Federation (ATF), last fall, filed a “Lawsuit in New Mexico Challenging [the] State’s Teacher Evaluation System.” Plaintiffs charged that the state’s teacher evaluation system, imposed on the state in 2012 by the state’s current Public Education Department (PED) Secretary Hanna Skandera (with value-added counting for 50% of teachers’ evaluation scores), was unfair, error-ridden, spurious, harming teachers, and depriving students of high-quality educators, among other claims (see the actual lawsuit here). Again, I’m serving as the expert witness on the side of the plaintiffs in this suit.

As you all likely also recall, in December of 2015, State District Judge David K. Thomson granted a preliminary injunction preventing consequences from being attached to the state’s teacher evaluation data. More specifically, Judge Thomson ruled that the state could proceed with “developing” and “improving” its teacher evaluation system, but the state was not to make any consequential decisions about New Mexico’s teachers using the data the state collected until the state (and/or others external to the state) could evidence to the court during another trial (initially set for April 2016, then postponed to October 2016, and likely to be postponed again) that the system is reliable, valid, fair, uniform, and the like (see prior post on this ruling here).

Well, many of you have (since these prior posts) written requesting updates regarding this lawsuit, and here is one as released jointly by the AFT and ATF. This accurately captures the current and ongoing situation:

September 23, 2016

Many of you will remember the classic Christmas program, Rudolph the Red Nose Reindeer, and how the terrible and menacing abominable snowman became harmless once his teeth were removed. This is how you should view the PED evaluation you recently received – a harmless abominable snowman.  

The math is still wrong, the methodology deeply flawed, but the preliminary injunction achieved by our union, removed the teeth from PED’s evaluations, and so there is no reason to worry. As explained below, we will continue to fight these evaluations and will not rest until the PED institutes an evaluation system that is fair, meaningful, and consistently applied.

For all of you, who just got arbitrarily labeled by the PED in your summative evaluations, just remember, like the abominable snowman, these labels have no teeth, and your career is safe.

2014-2015 Evaluations

These evaluations, as you know, were the subject of our lawsuit filed in 2014. As a result of the Court’s order, the preliminary injunction, no negative consequences can result from your value-added scores.

In an effort to comply with the Court’s order, the PED announced in May it would be issuing new regulations.  This did not happen, and it did not happen in June, in July, in August, or in September. The bottom line is the PED still has not issued new regulations – though it still promises that those regulations are coming soon. So much for accountability.

The trial on the old regulations, scheduled for October 24, has been postponed based upon the PED’s repetitive assertions that new regulations would be issued.

In addition, we have repeatedly asked the PED to provide their data, which they finally did, however it lacked the codebook necessary to meaningfully interpret the data. We view this as yet another stall tactic.

Soon, we will petition the Court for an order compelling PED to produce the documents it promised months ago. Our union’s lawyers and expert witnesses will use this data to critically analyze the PED’s claims and methodology … again.

2015-2016 Evaluations

Even though the PED has condensed the number of ways an educator can be evaluated in a false attempt to satisfy the Courts, the fact remains that value-added models are based on false math and highly inaccurate data. In addition to the PED’s information we have requested for the 2014-2015 evaluations, we have requested all data associated with the current 2015-2016 evaluations.

If our experts determine the summative evaluation scores are again, “based on fundamentally, and irreparably, flawed methodology which is further plagued by consistent and appalling data errors,” we will also challenge the 2015-2016 evaluations. If the PED ever releases new regulations, and we determine that they violate statute (again), we will challenge those regulations, as well.

Rest assured our union will not stop challenging the PED until we are satisfied they have adopted an evaluation system that is respectful of students and educators. We will keep you updated as we learn more information, including the release of new regulations and the rescheduled trial date.

In Solidarity,

Stephanie Ly                                   Ellen Bernstein
President, AFT NM                         President, ATF

A New Book about VAMs “On Trial”

I recently heard about a new book that was written by Mark Paige — J.D. and Ph.D., assistant professor of public policy at the University of Massachusetts-Dartmouth, and a former school law attorney — and published by Rowman & Littlefield. The book is about, as per the secondary part of its title “Understanding Value-Added Models [VAMs] in the Law of Teacher Evaluation.” See more on this book, including information about how to purchase it, for those of you who might be interested in reading more, here, and also via Amazon here.

Clearly, this book is to prove very relevant given the ongoing court cases across the country (see a prior post on these cases here) regarding teachers and the systems being used to evaluate them when especially (or extremely) reliant upon VAM-based estimates for consequential decision-making purposes (e.g., teacher tenure, pay, and termination). While I have not yet read the book, I just ordered my copy the other day. I suggest you do the same, again, should you be interested in further or better understanding the federal and state law pertinent to these cases.

Notwithstanding, I also requested that the author of this book — Mark Paige — write a guest post so that you too could find out more. Here is what he wrote:

Many of us have been following VAMs in legal circles. Several courts have faced the issue of VAMs as they relate to employment law matters. These cases have tested a chief selling point (pardon [or underscore] the business reference) of VAMs: that they will effectuate, for example, teacher termination with greater ease because nobody besides the advanced statisticians and econometricians can argue with their numbers derived. In other words, if a teacher’s VAM rating is bad, then the teacher must be bad. It’s to be as simple as that. How can a court deny that, reality?

Of course, as we [should] already know, VAMs are anything but certain. Bluntly stated: VAMs are a statistical “hot mess.” The American Statistical Association, among many others, warned in no uncertain terms that VAMs cannot – and should not – be trusted to make significant employment decisions. Of course, that has not stopped many policymakers from a full-throated adoption of their use in employment and evaluation decisions. Talk about hubris.

Accordingly, I recently completed this book, again, that focuses squarely at the intersection of VAMs and the law. Its full title is “Building a Better Teacher: Understanding Value-Added Models in the Law of Teacher Evaluation” Rowman & Littlefield, 2016). Again, I provide a direct link to the book along with its description here.

To offer a bit of a sneak preview, thought, I draw many conclusions throughout the book, but one of two important take-aways is this: VAMs may actually complicate the effectuation of a teacher’s termination. Here’s one way: because VAMs are so statistically infirm, they invite plaintiff-side attorneys to attack any underlying negative decision based on these models. See, for example, Sheri Lederman’s recent New York State Supreme Court’s decision, here. [See also a related post in this blog here].

In other words, the evidence upon which districts or states rely to make significant decisions is untrustworthy (or arbitrary) and, therefore, so is any decision as based, even if in part, on VAMs. Thus, VAMs may actually strengthen a teacher’s case. This, of course, is quite apart from the fact that VAM use results in firing good teachers based on poor information, thereby contributing to the teacher shortages and lower morale (among many other parades of horribles) being reported across the nation, and now more than likely ever.

The second important take-away is this, especially given followers of this blog include many educators and administrators facing a barrage of criticisms that only “de-professionalize” them: Courts have, over time, consistently deferred to the professional judgment of administrators (and their assessment of effective teaching). The members of that august institution – the judiciary – actually believe that educators know best about teaching, and that years of accumulated experience and knowledge have actual and also court-relevant value. That may come as a startling revelation to those who consistently diminish the education profession, or those who at least feel like they and their efforts are consistently being diminished.

To be sure, the system of educator evaluation is not perfect. Our schools continue to struggle to offer equal and equitable educational opportunities to all students, especially those in the nation’s highest needs schools. But what this book ultimately concludes is that the continued use of VAMs will not, hu-hum, add any value to these efforts.

To reach author Mark Paige via email, please contact him at mpaige@umassd.edu. To reach him via Twitter: @mpaigelaw

No More EVAAS for Houston: School Board Tie Vote Means Non-Renewal

Recall from prior posts (here, here, and here) that seven teachers in the Houston Independent School District (HISD), with the support of the Houston Federation of Teachers (HFT), are taking HISD to federal court over how their value-added scores, derived via the Education Value-Added Assessment System (EVAAS), are being used, and allegedly abused, while this district that has tied more high-stakes consequences to value-added output than any other district/state in the nation. The case, Houston Federation of Teachers, et al. v. Houston ISD, is ongoing.

But just announced is that the HISD school board, in a 3:3 split vote late last Thursday night, elected to no longer pay an annual $680K to SAS Institute Inc. to calculate the district’s EVAAS value-added estimates. As per an HFT press release (below), HISD “will not be renewing the district’s seriously flawed teacher evaluation system, [which is] good news for students, teachers and the community, [although] the school board and incoming superintendent must work with educators and others to choose a more effective system.”

here

Apparently, HISD was holding onto the EVAAS, despite the research surrounding the EVAAS in general and in Houston, in that they have received (and are still set to receive) over $4 million in federal grant funds that has required them to have value-added estimates as a component of their evaluation and accountability system(s).

While this means that the federal government is still largely in favor of the use of value-added model (VAMs) in terms of its funding priorities, despite their prior authorization of the Every Student Succeeds Act (ESSA) (see here and here), this also means that HISD might have to find another growth model or VAM to still comply with the feds.

Regardless, during the Thursday night meeting a board member noted that HISD has been kicking this EVAAS can down the road for 5 years. “If not now, then when?” the board member asked. “I remember talking about this last year, and the year before. We all agree that it needs to be changed, but we just keep doing the same thing.” A member of the community said to the board: “VAM hasn’t moved the needle [see a related post about this here]. It hasn’t done what you need it to do. But it has been very expensive to this district.” He then listed the other things on which HISD could spend (and could have spent) its annual $680K EVAAS estimate costs.

Soon thereafter, the HISD school board called for a vote, and it ended up being a 3-3 tie. Because of the 3-3 tie vote, the school board rejected the effort to continue with the EVAAS. What this means for the related and aforementioned lawsuit is still indeterminate at this point.

“Arbitrary and Capricious:” Sheri Lederman Wins Lawsuit in NY’s State Supreme Court

Recall the New York lawsuit pertaining to Long Island teacher Sheri Lederman? She just won in New York’s State Supreme court, and boy did she win big, also for the cause!

Sheri is a teacher, who by all accounts other than her 2013-2014 “ineffective” growth score of a 1/20, is a terrific 4th grade, 18-year veteran teacher. However, after receiving her “ineffective” growth rating and score, she along with her attorney and husband Bruce Lederman, sued the state of New York to challenge the state’s growth-based teacher evaluation system and Sheri’s individual score. See prior posts about Sheri’s case here, herehere and here.

The more specific goal of her case was to seek a judgment: (1) setting aside or vacating Sheri’s individual growth score and rating her as “ineffective,” and (2) declare that the New York endorsed and implemented growth measures in use was/is “arbitrary and capricious.” The “overall gist” was that Sheri contended that the system unfairly penalized teachers whose students consistently scored well and could not demonstrated growth upwards (e.g., teachers of gifted or other high achieving students). This concern/complaint is common elsewhere.

As per a State Supreme Court ruling, just released today as written by Acting Supreme Court Justice Judge Roger McDonough (May 10, 2016), and at 15 pages in length and available in full here, Sheri won her case. She won it against John King — the then New York State Education Department Commissioner and the now US Secretary of Education (who recently replaced Arne Duncan as US Secretary of Education). The Court concluded that Sheri (her husband, her team of experts, and other witnesses) effectively established that her growth score and rating for 2013-2014 was “arbitrary and capricious,” with “arbitrary and capricious” being defined as actions “taken without sound basis in reason or regard to the facts.”

More specifically, the Court’s conclusion was founded upon: (1) the convincing and detailed evidence of VAM bias against teachers at both ends of the spectrum (e.g. those with high-performing students or those with low-performing students); (2) the disproportionate effect of petitioner’s small class size and relatively large percentage of high-performing students; (3) the functional inability of high-performing students to demonstrate growth akin to lower-performing students; (4) the wholly unexplained swing in petitioner’s growth score from 14 [i.e., her growth score the year prior] to 1, despite the presence of statistically similar scoring students in her respective classes; and, most tellingly, (5) the strict imposition of rating constraints in the form of a “bell curve” that places teachers in four categories via pre-determined percentages regardless of whether the performance of students dramatically rose or dramatically fell from the previous year.”

As per an email I received earlier today from Bruce (i.e., Sheri’s husband/attorney who prosecuted her case), the Court otherwise “declined to make an overall ruling on the [New York growth] rating system in general because of new regulations in effect” [e.g., that the state’s growth model is currently under review]…[Nontheless, t]he decision should qualify as persuasive authority for other teachers challenging growth scores throughout the County [and Country]. [In addition, the] Court carefully recite[d] all our expert affidavits [i.e., from Professors Darling-Hammond, Pallas, Amrein-Beardsley, Sean Corcoran and Jesse Rothstein as well as Drs. Burris and Lindell].” Noted as well were the “absence of any meaningful’ challenge to [Sheri’s] experts’ conclusions, especially about the dramatic swings noticed between her, and potentially others’ scores, and the other ‘litany of expert affidavits submitted on [Sheris’] behalf].”

“It is clear that the evidence all of these amazing experts presented was a key factor in winning this case since the Judge repeatedly said both in Court and in the decision that we have a “high burden” to meet in this case.” [In addition,] [t]he Court wrote that the court “does not lightly enter into a critical analysis of this matter … [and] is constrained on this record, to conclude that [the] petitioner [i.e., Sheri] has met her high burden.”

To Bruce’s/our knowledge, this is the first time a judge has set aside an individual teacher’s VAM rating based upon such a presentation in court.

Thanks to all who helped in this endeavor. Onward!

Virginia SGP’s Side of the Story

In one of my most recent posts I wrote about how Virginia SGP, aka parent Brian Davison, won in court against the state of Virginia, requiring them to release teachers’ Student Growth Percentile (SGP) scores. Virginia SGP is a very vocal promoter of the use of SGPs to evaluate teachers’ value-added (although many do not consider the SGP model to be a value-added model (VAM); see general differences between VAMs and SGPs here). Regardless, he sued the state of Virginia to release teachers’ SGP scores so he could make them available to all via the Internet. He did this, more specifically, so parents and perhaps others throughout the state would be able to access and then potentially use the scores to make choices about who should and should not teach their kids. See other posts about this story here and here.

Those of us who are familiar with Virginia SGP and the research literature writ large know that, unfortunately, there’s much that Virginia SGP does not understand about the now loads of research surrounding VAMs as defined more broadly (see multiple research article links here). Likewise, Virginia SGP, as evidenced below, rides most of his research-based arguments on select sections of a small handful of research studies (e.g., those written by economists Raj Chetty and colleagues, and Thomas Kane as part of Kane’s Measures of Effective Teaching (MET) studies) that do not represent the general research on the topic. He simultaneously ignores/rejects the research studies that empirically challenge his research-based claims (e.g., that there is no bias in VAM-based estimates, and that because Chetty, Friedman, and Rockoff “proved this,” it must be true, despite the research studies that have presented evidence otherwise (see for example here, here, and here).

Nonetheless, given that him winning this case in Virginia is still noteworthy, and followers of this blog should be aware of this particular case, I invited Virginia SGP to write a guest post so that he could tell his side of the story. As we have exchanged emails in the past, which I must add have become less abrasive/inflamed as time has passed, I recommend that readers read and also critically consume what is written below. Let’s hope that we might have some healthy and honest dialogue on this particular topic in the end.

From Virginia SGP:

I’d like to thank Dr. Amrein-Beardsley for giving me this forum.

My school district recently announced its teacher of the year. John Tuck teaches in a school with 70%+ FRL students compared to a district average of ~15% (don’t ask me why we can’t even those #’s out). He graduated from an ordinary school with a degree in liberal arts. He only has a Bachelors and is not a National Board Certified Teacher (NBCT). He is in his ninth year of teaching specializing in math and science for 5th graders. Despite the ordinary background, Tuck gets amazing student growth. He mentors, serves as principal in the summer, and leads the school’s leadership committees. In Dallas, TX, he could have risen to the top of the salary scale already, but in Loudoun County, VA, he only makes $55K compared to a top salary of $100K for Step 30 teachers. Tuck is not rewarded for his talent or efforts largely because Loudoun eschews all VAMs and merit-based promotion.

This is largely why I enlisted the assistance of Arizona State law school graduate Lin Edrington in seeking the Virginia Department of Education’s (VDOE) VAM (SGP) data via a Freedom of Information Act (FOIA) suit (see pertinent files here).

VAMs are not perfect. There are concerns about validity when switching from paper to computer tests. There are serious concerns about reliability when VAMs are computed with small sample sizes or are based on classes not taught by the rated teacher (as appeared to occur in New Mexico, Florida, and possibly New York). Improper uses of VAMs give reformers a bad name. This was not the case in Virginia. SGPs were only to be used when appropriate with 2+ years of data and 40+ scores recommended.

I am a big proponent of VAMs based on my reviews of the research. We have the Chetty/Friedman/Rockoff (CFR) studies, of course, including their recent paper showing virtually no bias (Table 6). The following briefing presented by Professor Friedman at our trial gives a good layman’s overview of their high level findings. When teachers are transferred to a completely new school but their VAMs remain consistent, that is very convincing to me. I understand some point to the cautionary statement of the ASA suggesting districts apply VAMs carefully and explicitly state their limitations. But the ASA definitely recommends VAMs for analyzing larger samples including schools or district policies, and CFR believe their statement failed to consider updated research.

To me, the MET studies provided some of the most convincing evidence. Not only are high VAMs on state standardized tests correlated to higher achievement on more open-ended short-answer and essay-based tests of critical thinking, but students of high-VAM teachers are more likely to enjoy class (Table 14). This points to VAMs measuring inspiration, classroom discipline, the ability to communicate concepts, subject matter knowledge and much more. If a teacher engages a disinterested student, their low scores will certainly rise along with their VAMs. CFR and others have shown this higher achievement carries over into future grades and success later in life. VAMs don’t just measure the ability to identify test distractors, but the ability of teachers to inspire.

So why exactly did the Richmond City Circuit Court force the release of Virginia’s SGPs? VDOE applied for and received a No Child Left Behind (NCLB) waiver like many other states. But in court testimony provided in December of 2014, VDOE acknowledged that districts were not complying with the waiver by not providing the SGP data to teachers or using SGPs in teacher evaluations despite “assurances” to the US Department of Education (USDOE). When we initially received a favorable verdict in January of 2015, instead of trying to comply with NCLB waiver requirements, my district of Loudoun County Publis Schools (LCPS) laughed. LCPS refused to implement SGPs or even discuss them.

There was no dispute that the largest Virginia districts had committed fraud when I discussed these facts with the US Attorney’s office and lawyers from the USDOE in January of 2016, but the USDOE refused to support a False Claim Act suit. And while nearly every district stridently refused to use VAMs [i.e., SGPs], the Virginia Secretary of Education was falsely claiming in high profile op-eds that Virginia was using “progress and growth” in the evaluation of schools. Yet, VDOE never used the very measure (SGPs) that the ESEA [i.e., NCLB] waivers required to measure student growth. The irony is that if these districts had used SGPs for just 1% of their teachers’ evaluations after the December of 2014 hearing, their teachers’ SGPs would be confidential today. I could only find one county that utilized SGPs, and their teachers’ SGPs are exempt. Sometimes fraud doesn’t pay.

My overall goals are threefold:

  1. Hire more Science Technology Engineering and Mathematics (STEM) majors to get kids excited about STEM careers and effectively teach STEM concepts
  2. Use growth data to evaluate policies, administrators, and teachers. Share the insights from the best teachers and provide professional development to ineffective ones
  3. Publish private sector equivalent pay so young people know how much teachers really earn (pensions often add 15-18% to their salaries). We can then recruit more STEM teachers and better overall teaching candidates

What has this lawsuit and activism cost me? A lot. I ate $5K of the cost of the VDOE SGP suit even after the award[ing] of fees. One local school board member has banned me from commenting on his “public figure” Facebook page (which I see as a free speech violation), both because I questioned his denial of SGPs and some other conflicts of interests I saw, although indirectly related to this particular case. The judge in the case even sanctioned me $7K just for daring to hold him accountable. And after criticizing LCPS for violating Family Educational Rights and Privacy Act (FERPA) by coercing kids who fail Virginia’s Standards of Learning tests (SOLs) to retake them, I was banned from my kids’ school for being a “safety threat.”

Note that I am a former Naval submarine officer and have held Department of Defense (DOD) clearances for 20+ years. I attended a meeting this past Thursday with LCPS officials in which they [since] acknowledged I was no safety threat. I served in the military, and along with many I have fought for the right to free speech.

Accordingly, I am no shrinking violet. Despite having LCPS attorneys sanction perjury, the Republican Commonwealth Attorney refused to prosecute and then illegally censored me in public forums. So the CA will soon have to sign a consent order acknowledging violating my constitutional rights (he effectively admitted as much already). And a federal civil rights complaint against the schools for their retaliatory ban is being drafted as we speak. All of this resulted from my efforts to have public data released and hold LCPS officials accountable to state and federal laws. I have promised that the majority of any potential financial award will be used to fund other whistle blower cases, [against] both teachers and reformers. I have a clean background and administrators still targeted me. Imagine what they would do to someone who isn’t willing to bear these costs!

In the end, I encourage everyone to speak out based on your beliefs. Support your case with facts not anecdotes or hastily conceived opinions. And there are certainly efforts we can all support like those of Dr. Darling-Hammond. We can hold an honest debate, but please remember that schools don’t exist to employ teachers/principals. Schools exist to effectively educate students.