“Virginia SGP” Wins in Court Against State

Virginia SGP, also known as Brian Davison — a parent of two public school students in the affluent Loudoun, Virginia area (hereafter referred to as Virginia SGP) — has been an avid (and sometimes abrasive) commentator about value-added models (VAMs), defined generically, on this blog (see, for example, here, here, and here), on Diane Ravitch’s blog (see, for example, here, here, and here), and elsewhere (e.g., Virginia SGP’s Facebook page here). He is an advocate and promoter of the use of VAMs (which are in this particular case Student Growth Percentiles (SGPs); see differences between VAMs and SGPs here and here) to evaluate teachers, and he is an advocate and promoter of the release of teachers’ SGP scores to parents and the general public for their consumption and use.
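
For readers who want a more concrete sense of that distinction, here is a minimal, hypothetical sketch (in Python, with made-up data and column names) of the generic idea: an SGP is essentially a student's percentile rank among peers with similar prior scores, while a VAM typically estimates a teacher effect from a regression of current on prior achievement (plus other covariates). Neither snippet reflects Virginia's actual SGP model or any vendor's VAM.

```python
# A minimal, hypothetical sketch of the generic distinction between an SGP and a
# VAM. The data, column names (prior_score, current_score, teacher_id), and the
# decile-binning shortcut are illustrative assumptions; this is NOT Virginia's
# actual SGP model or any vendor's VAM.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "teacher_id": rng.integers(0, 50, n),
    "prior_score": rng.normal(500, 50, n),
})
df["current_score"] = 0.8 * df["prior_score"] + rng.normal(100, 30, n)

# SGP idea: a student's percentile rank among peers with similar prior
# achievement (peers approximated here by prior-score deciles); a teacher is
# then typically summarized by the median SGP of her students.
df["prior_bin"] = pd.qcut(df["prior_score"], 10, labels=False)
df["sgp"] = df.groupby("prior_bin")["current_score"].rank(pct=True) * 100
teacher_median_sgp = df.groupby("teacher_id")["sgp"].median()

# VAM idea: regress current scores on prior scores (real models add more
# covariates and random effects) and summarize each teacher by the average
# residual of her students.
model = smf.ols("current_score ~ prior_score", data=df).fit()
df["residual"] = model.resid
teacher_value_added = df.groupby("teacher_id")["residual"].mean()

print(teacher_median_sgp.head())
print(teacher_value_added.head())
```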

Related, and as described in a Washington Post article published in March of 2016, Virginia SGP “…Pushed [Virginia] into Debate of Teacher Privacy vs. Transparency for Parents” as per teachers’ SGP data. This occurred via a lawsuit Virginia SGP filed against the state, attempting to force the release of teachers’ SGP data for all teachers across the state. More specifically, and akin to what happened in 2010 when the Los Angeles Times published the names and VAM-based ratings of thousands of teachers teaching in the Los Angeles Unified School District (LAUSD), Virginia SGP “pressed for the data’s release because he thinks parents have a right to know how their children’s teachers are performing, information about public employees that exists but has so far been hidden. He also wants to expose what he says is Virginia’s broken promise to begin using the data to evaluate how effective the state’s teachers are.” He thinks that “teacher data should be out there,” especially if taxpayers are paying for it.

In January, a Richmond, Virginia judge ruled in Virginia SGP’s favor, despite the state’s claims that Virginia school districts, notwithstanding the state’s investments, had reportedly not been using the SGP data, “calling them flawed and unreliable measures of a teacher’s effectiveness.” And even though this ruling was challenged by state officials and the Virginia Education Association thereafter, Virginia SGP posted via his Facebook page the millions of student records the state released in compliance with the court, with teacher names and other information redacted.

This past Tuesday, however, and despite the challenges to the court’s initial ruling, came another win for Virginia SGP, as well as another loss for the state of Virginia. See the article “Judge Sides with Loudoun Parent Seeking Teachers’ Names, Student Test Scores,” published yesterday in a local Loudoun, Virginia news outlet.

The author of this article, Danielle Nadler, explains more specifically that, “A Richmond Circuit Court judge has ruled that [the] VDOE [Virginia Department of Education] must release Loudoun County Public Schools’ Student Growth Percentile [SGP] scores by school and by teacher…[including] teacher identifying information.” The judge noted that “VDOE and the Loudoun school system failed to ‘meet the burden of proof to establish an exemption’ under Virginia’s Freedom of Information Act [FOIA].” The court also ordered VDOE to pay Davison $35,000 to cover his attorney fees and other costs. This final order was dated April 12, 2016.

“Davison said he plans to publish the information on his ‘Virginia SGP’ Facebook page. Students will not be identified, but some of the teachers will. ‘I may mask the names of the worst performers when posting rankings/lists but other members of the public can analyze the data themselves to discover who those teachers are,’ Virginia SGP said.”

I’ve exchanged messages with Virginia SGP both before and since this ruling, and I’ve explicitly invited him to comment via this blog. While I disagree with his objective and with the subsequent ruling, although I do believe in transparency, this outcome is nonetheless newsworthy in the realm of VAMs and for followers/readers of this blog. Comment now and/or do stay tuned for more.

The “Vergara v. California” Decision Reversed: Another (Huge) Victory in Court

In June of 2014, defendants in “Vergara v. California” in Los Angeles, California lost their case. As a reminder, plaintiffs included nine public school students (backed by some serious corporate reformer funds as per Students Matter) who challenged five California state statutes that supported the state’s “ironclad [teacher] tenure system.” The plaintiffs’ argument was that students’ rights to a good education were being violated by teachers’ job protections…protections that were making it too difficult to fire “grossly ineffective” teachers. The plaintiffs’ suggested replacement to the “old” way of doing this, of course, was to use value-added scores to make “better” decisions about which teachers to fire and whom to keep around.

In February of 2016, “Vergara v. California” was appealed, back in Los Angeles.

Released, yesterday, was the Court of Appeal’s decision reversing the trial court’s earlier decision. As per an email I received also yesterday from one of the lawyers involved, “The unanimous decision holds that the plaintiffs did not establish their equal protection claim because they did not show that the challenged [“ironclad” tenure] laws themselves cause harm to poor students or students of color.” Accordingly, the Court of Appeal “ordered that judgment be entered for the defendants (the state officials and teachers’ unions)…[and]…this should end the case, and copycat cases in other parts of the country [emphasis added].” However, plaintiffs have already announced their intent to appeal this ruling to the California Supreme Court.

Please find attached here, as certified for publication, the actual Court of Appeal decision. See also a post here about this reversal authored by California teachers’ unions. See also more information released by the California Teachers Association here.

See also the amicus brief that a large set of deans and professors across the country contributed to/signed to help in this reversal.

Victory in New Mexico’s Lawsuit, Again

My most recent post about the state of New Mexico (here) included an explanation of a New Mexico Judge’s ruling to postpone New Mexico’s state-wide teacher evaluation trial until October 2016, with the state’s December 2015 preliminary injunction (described here) in place until (at least) then.

New Mexico’s Public Education Department (PED) recently, however, also tried to appeal the Judge’s December 2015 preliminary injunction, and took it to New Mexico’s Court of Appeals for an emergency review of the Judge’s injunction order.

The state and its PED lost, again. Here is the court order, which essentially says that the appeal was denied, and pasted below is the press release, released by the American Federation of Teachers New Mexico and Albuquerque Teachers Federation (i.e., the plaintiffs in this case).

Also here is an article just released in the Santa Fe New Mexican about this ruling, and about how the “Appeals court reject[ed the state’s] request to intervene in [this] teacher evaluation case.”

PRESS RELEASE, FOR IMMEDIATE RELEASE

Court Denies Request from Public Education Department; Keeps Case in District Court

March 16, 2016

Contact: John Dyrcz
505-554-8679

Albuquerque – American Federation of Teachers New Mexico (AFT NM) President Stephanie Ly and Albuquerque Teachers Federation (ATF) President Ellen Bernstein released the following statement:

“We are not surprised by today’s decision of the New Mexico Court of Appeals denying the New Mexico Public Education Department’s request for an interlocutory – or emergency – review of District Court Judge David Thomson’s injunction order. The December 2015 injunction preventing the PED from using its faulty evaluation system to penalize educators was well reasoned and the product of a fair and lengthy series of hearings over four months.

“We have maintained throughout this process that while the PED has every right to pursue all legal options under our judicial system, these frequent attempts at disrupting the progress of this case are nothing more than an attempt to stall the momentum of our efforts to seek relief for New Mexico’s education community.

“With this order, the case returns to Judge Thomson for final testimony from our expert witnesses, and we are pleased that the temporary injunction granted in December of 2015 will remain in place until at least October of 2016, when AFT NM and ATF will seek to make the injunction permanent,” said Ly and Bernstein.

Alleged Violation of Protective Order in Houston Lawsuit, Overruled

Many of you will recall a post I made public in January including “Houston Lawsuit Update[s], with Summar[ies] of Expert Witnesses’ Findings about the EVAAS” (Education Value-Added Assessment System sponsored by SAS Institute Inc.). What you might not have recognized since, however, was that I pulled the post down a few weeks after I posted it. Here’s the back story.

In January 2016, the Houston Federation of Teachers (HFT) published an “EVAAS Litigation Update,” which summarized a portion of Dr. Jesse Rothstein’s expert report in which he conclude[d], among other things, that teachers do not have the ability to meaningfully verify their EVAAS scores. He wrote that “[a]t most, a teacher could request information about which students were assigned to her, and could read literature — mostly released by SAS, and not the product of an independent investigation — regarding the properties of EVAAS estimates.” On January 10, 2016, I posted the post: “Houston Lawsuit Update, with Summary of Expert Witnesses’ Findings about the EVAAS” summarizing what I considered to be the twelve key highlights of HFT’s “EVAAS Litigation Update,” in which I highlighted Rothstein’s above conclusions.

Lawyers representing SAS Institute Inc. charged that this post, along with the more detailed “EVAAS Litigation Update” I summarized within the post (authored by the Houston Federation of Teachers (HFT) to keep their members in Houston up-to-date on the progress of this lawsuit) violated a protective order that was put in place to protect SAS’s EVAAS computer source code. Even though there is/was nothing in the “EVAAS Litigation Update” or the blog post that disclosed the source code, SAS objected to both as disclosing conclusions that, SAS said, could not have been reached in the absence of a review of the source code. They threatened HFT, its lawyers, and its experts (myself and Dr. Rothstein) with monetary sanctions. HFT went to court in order to get the court’s interpretation of the protective order and to see if a Judge agreed with SAS’s position. In the meantime, I removed the prior post (which is now back up here).

The great news is that the Judge found in HFT’s favor. He found that neither the “EVAAS Litigation Update” nor the related blog post violated the protective order. Further, he found that “we” have the right to share other updates on the Houston lawsuit, which is still pending, as long as the updates do not violate the protective order still in place. This includes discussion of the conclusions or findings of experts, provided that the source code is not disclosed, either explicitly or by necessary implication.

In more specific terms, as per his ruling in his Court Order, the judge ruled that SAS Institute Inc.’s lawyers “interpret[ed] the protective order too broadly in this instance. Rothstein’s opinion regarding the inability to verify or replicate a teacher’s EVAAS score essentially mimics the allegations of HFT’s complaint. The Litigation Update made clear that Rothstein confirmed this opinion after review of the source code; but it [was] not an opinion ‘that could not have been made in the absence of [his] review’ of the source code. Rothstein [also] testified by affidavit that his opinion is not based on anything he saw in the source code, but on the extremely restrictive access permitted by SAS.” He added that “the overly broad interpretation urged by SAS would inhibit legitimate discussion about the lawsuit, among both the union’s membership and the public at large.” That, also in his words, would be an “unfortunate result” that should, in the future, be avoided.

Here, again, are the 12 key highlights of the EVAAS Litigation Update:
  • Large-scale standardized tests have never been validated for their current uses. In other words, as per my affidavit, “VAM-based information is based upon large-scale achievement tests that have been developed to assess levels of student achievement, but not levels of growth in student achievement over time, and not levels of growth in student achievement over time that can be attributed back to students’ teachers, to capture the teachers’ [purportedly] causal effects on growth in student achievement over time.”
  • The EVAAS produces different results from another VAM. When, for this case, Rothstein constructed and ran an alternative, albeit similarly sophisticated, VAM (using the same HISD data for both models), he found that the results “yielded quite different rankings and scores.” This should not happen if these models are indeed yielding indicators of truth, or true levels of teacher effectiveness from which valid interpretations and assertions can be made.
  • EVAAS scores are highly volatile from one year to the next. Rothstein, when running the actual data, found that while “[a]ll VAMs are volatile…EVAAS growth indexes and effectiveness categorizations are particularly volatile due to the EVAAS model’s failure to adequately account for unaccounted-for variation in classroom achievement.” In addition, volatility is “particularly high in grades 3 and 4, where students have relatively few[er] prior [test] scores available at the time at which the EVAAS scores are first computed.” (See an illustrative sketch of this volatility after this list.)
  • EVAAS overstates the precision of teachers’ estimated impacts on growth. As per Rothstein, “This leads EVAAS to too often indicate that teachers are statistically distinguishable from the average…when a correct calculation would indicate that these teachers are not statistically distinguishable from the average.”
  • Teachers of English Language Learners (ELLs) and “highly mobile” students are substantially less likely to demonstrate added value, as per the EVAAS, and likely most/all other VAMs. This, what we term as “bias,” makes it “impossible to know whether this is because ELL teachers [and teachers of highly mobile students] are, in fact, less effective than non-ELL teachers [and teachers of less mobile students] in HISD, or whether it is because the EVAAS VAM is biased against ELL [and these other] teachers.”
  • The number of students each teacher teaches (i.e., class size) also biases teachers’ value-added scores. As per Rothstein, “teachers with few linked students—either because they teach small classes or because many of the students in their classes cannot be used for EVAAS calculations—are overwhelmingly [emphasis added] [more] likely to be assigned to the middle effectiveness category under EVAAS (labeled ‘no detectable difference [from average], and average effectiveness’) than are teachers with more linked students.”
  • Ceiling effects are certainly an issue. Rothstein found that in some grades and subjects, “teachers whose students have unusually high prior year scores are very unlikely to earn high EVAAS scores, suggesting that ‘ceiling effects’ in the tests are certainly relevant factors.” While EVAAS and HISD have previously acknowledged such problems with ceiling effects, they apparently believe these effects are being mediated with the new and improved tests recently adopted throughout the state of Texas. Rothstein, however, found that these effects persist even given the new and improved tests.
  • There are major validity issues with “artificial conflation.” This is a term I recently coined to represent what is happening in Houston, and elsewhere (e.g., Tennessee), when district leaders (e.g., superintendents) mandate or force principals and other teacher effectiveness appraisers or evaluators, for example, to align their observational ratings of teachers’ effectiveness with value-added scores, with the latter being the “objective measure” around which all else should revolve, or align; hence, the conflation of the one to match the other, even if entirely invalid. As per my affidavit, “[t]o purposefully and systematically endorse the engineering and distortion of the perceptible ‘subjective’ indicator, using the perceptibly ‘objective’ indicator as a keystone of truth and consequence, is more than arbitrary, capricious, and remiss…not to mention in violation of the educational measurement field’s Standards for Educational and Psychological Testing” (American Educational Research Association (AERA), American Psychological Association (APA), National Council on Measurement in Education (NCME), 2014).
  • Teaching-to-the-test is of perpetual concern. Both Rothstein and I, independently, noted concerns about how “VAM ratings reward teachers who teach to the end-of-year test [more than] equally effective teachers who focus their efforts on other forms of learning that may be more important.”
  • HISD is not adequately monitoring the EVAAS system. According to HISD, EVAAS modelers keep the details of their model secret, even from them and even though they are paying an estimated $500K per year for district teachers’ EVAAS estimates. “During litigation, HISD has admitted that it has not performed or paid any contractor to perform any type of verification, analysis, or audit of the EVAAS scores. This violates the technical standards for use of VAM that AERA specifies, which provide that if a school district like HISD is going to use VAM, it is responsible for ‘conducting the ongoing evaluation of both intended and unintended consequences’ and that ‘monitoring should be of sufficient scope and extent to provide evidence to document the technical quality of the VAM application and the validity of its use’ (AERA Statement, 2015).”
  • EVAAS lacks transparency. AERA emphasizes the importance of transparency with respect to VAM uses. For example, as per the AERA Council who wrote the aforementioned AERA Statement, “when performance levels are established for the purpose of evaluative decisions, the methods used, as well as the classification accuracy, should be documented and reported” (AERA Statement, 2015). However, and in contrast to meeting AERA’s requirements for transparency, in this district and elsewhere, as per my affidavit, the “EVAAS is still more popularly recognized as the ‘black box’ value-added system.”
  • Related, teachers lack opportunities to verify their own scores. This part is really interesting. “As part of this litigation, and under a very strict protective order that was negotiated over many months with SAS [i.e., SAS Institute Inc. which markets and delivers its EVAAS system], Dr. Rothstein was allowed to view SAS’ computer program code on a laptop computer in the SAS lawyer’s office in San Francisco, something that certainly no HISD teacher has ever been allowed to do. Even with the access provided to Dr. Rothstein, and even with his expertise and knowledge of value-added modeling, [however] he was still not able to reproduce the EVAAS calculations so that they could be verified.” Dr. Rothstein added, “[t]he complexity and interdependency of EVAAS also presents a barrier to understanding how a teacher’s data translated into her EVAAS score. Each teacher’s EVAAS calculation depends not only on her students, but also on all other students within HISD (and, in some grades and years, on all other students in the state), and is computed using a complex series of programs that are the proprietary business secrets of SAS Incorporated. As part of my efforts to assess the validity of EVAAS as a measure of teacher effectiveness, I attempted to reproduce EVAAS calculations. I was unable to reproduce EVAAS, however, as the information provided by HISD about the EVAAS model was far from sufficient.”
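
To make the volatility point above more concrete, below is a minimal sketch, using simulated data, of how one might quantify year-to-year instability in teacher ratings: correlate two years of estimates for the same teachers and count how many change effectiveness categories. The noise levels and the tercile cut points are hypothetical assumptions, not the actual EVAAS specification.

```python
# A minimal sketch, on simulated data, of quantifying year-to-year volatility in
# teacher ratings. The "true effect" / noise split and the tercile categories
# are hypothetical assumptions, not the EVAAS model's actual values.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_teachers = 500

# Each teacher has a stable underlying effect, but each year's estimate adds
# substantial estimation noise (the source of the observed volatility).
true_effect = rng.normal(0, 1, n_teachers)
year1 = true_effect + rng.normal(0, 1.5, n_teachers)
year2 = true_effect + rng.normal(0, 1.5, n_teachers)

def categorize(scores: np.ndarray) -> pd.Categorical:
    # Three hypothetical effectiveness categories based on terciles.
    return pd.qcut(scores, 3, labels=["low", "average", "high"])

df = pd.DataFrame({"cat1": categorize(year1), "cat2": categorize(year2)})

print("Year-to-year correlation of estimates:", round(np.corrcoef(year1, year2)[0, 1], 2))
print("Share of teachers changing category:", round((df["cat1"] != df["cat2"]).mean(), 2))
print(pd.crosstab(df["cat1"], df["cat2"]))
```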

New Mexico’s Teacher Evaluation Trial Postponed Until October, w/Preliminary Injunction Still in Place

Last December in New Mexico, a Judge granted a preliminary injunction preventing consequences from being attached to the state’s teacher evaluation data as based on the state’s value-added model (VAM). More specifically, Judge David K. Thomson ruled that the state can proceed with “developing” and “improving” its teacher evaluation system, but the state is not to make any consequential decisions about New Mexico’s teachers using the data the state collects until the state (and/or others external to the state) can evidence to the court during another trial (which was set for April of 2016) that the system is reliable, valid, fair, uniform, and the like. See more details regarding Judge Thomson’s ruling in a previous post here: “Consequences Attached to VAMs Suspended Throughout New Mexico.” See more details about this specific lawsuit, sponsored by the American Federation of Teachers (AFT) New Mexico and the Albuquerque Teachers Federation (ATF), in a previous post here: “Lawsuit in New Mexico Challenging [the] State’s Teacher Evaluation System.” This is one of the cases on which I am continuing to serve as an expert witness.

Yesterday, however, and given another state-level lawsuit that is also ongoing regarding the state’s teacher evaluation system, although this one is sponsored by the National Education Association (NEA), Judge Thomson (apparently along with Judge Francis Mathew) pushed both the AFT-NM/ATF and NEA trials back to October of 2016, yielding a six month delay for the AFT-NM/ATF hearing.

According to an article published this morning in the Santa Fe New Mexican, “To date, the [New Mexico] Public Education Department [PED] has been unsuccessful in its efforts to stop either suit or combine them;” hence, yesterday in court the state requested that the court postpone both hearings so that the state could introduce its new teacher evaluation system, on March 15 of 2016, along with its specifics and rules, as also based on the state’s new Partnership for the Assessment of Readiness for College and Careers (PARCC) test data. Recall that the state’s Secretary of Education, Hanna Skandera, “is [the] new chair of [the] PARCC test board.” It is also anticipated, however, that the state’s new system is to still “rely heavily” (i.e., 50% weight) on VAMs. See also a related post about “New Mexico Chang[ing] its Teacher Evaluation System, But Not Really.”

This window of time is also to allow for the public forums needed to review the state’s new system, as well as time for “the acrimony to be resolved without trials.” The preliminary injunction granted by Judge Thomson in December, though, still remains in place. See also a related article, also published this morning, in the Albuquerque Journal.

Stephanie Ly, president of the AFT-NM, said she is not happy with the trial being postponed. She called this a “stalling tactic” to give the [state] education department more time to compile student achievement data that the plaintiffs have been requesting. “We had no option but to agree because they are withholding data,” she said.

Ly and ATF President Ellen Bernstein also responded yesterday via a joint statement, pasted in full below:

March 7, 2016

Contact: John Dyrcz — 505-554-8679

“The Public Education Department and Secretary Skandera have once again willfully delayed the AFT NM/ATF lawsuit against the current value added model [VAM] evaluation system due to their purposeful refusal to reveal the data being used to evaluate our educators in New Mexico.

“In addition to this stall tactic, and during a status hearing this morning in the First District Court, lawyers for the PED revealed that new rules and regulations were to be unveiled on March 15 by the PED, and would ‘rely heavily’ on VAM as a method of evaluation for educators.

“New Mexico educators will not cease in our fight against the abusive policies of this administration. Allowing PED or districts to terminate employees based on VAM and student test scores is completely unacceptable, it is unacceptable to allow PED or districts to refuse licensure advancement based upon VAM scores, and it is unacceptable for PED or districts to place New Mexico educators on growth plans based on faulty data.

“High-performing education systems have policies in place which respect and support their educators and use evaluations not as punitive measures but as opportunities for improvement. Educators, unions, and administrators should oversee the evaluation process to ensure it is thorough and of high quality, as well as fair and reliable. Educators, unions, and administrators should be involved in developing, implementing and monitoring the system to ensure it reflects good teaching well, that it operates effectively, that it is tied to useful learning opportunities for teachers, and that it produces valid results.

“It is well known the PED is in a current state of crisis with several high-level staff members abandoning the Department, an on-going whistle-blower lawsuit…the failure to produce meaningful changes to education in New Mexico during her six years as Secretary, and Skandera’s constant changes to the rules is a desperate attempt to right a sinking ship,” said Ly and Bernstein.

Vergara v. California Appeal Underway: The Case that Will Yield No Winners

In June of 2014, defendants in “Vergara v. California” in Los Angeles, California lost the case. Plaintiffs included nine public school students (backed by some serious corporate reformer funds as per Students Matter) who challenged five California state statutes that supported the state’s “ironclad [teacher] tenure system.” The plaintiffs’ argument was that students’ rights to a good education were being violated by teachers’ job protections…protections that were making it too difficult to fire “grossly ineffective” teachers. The plaintiffs’ suggested replacement to the “old” way of doing this, of course, was to use value-added scores to make “better” decisions about which teachers to fire and whom to keep around, as based on teachers’ causal impacts on students’ “data.”

This week, this case is being appealed, back in Los Angeles (see a recent Education Week article on the appeal here; see also the Students Matter website for daily appeal updates here). This, accordingly, is a very important case to watch, especially as many agree that this case will eventually end up in no less than the state’s Supreme Court.

On this note, though, I came across a great article, also in Education Week, this morning, capturing, as per the article’s title, the “Five Reasons Vergara Is Still Unwinnable.” I already tweeted this one out, but for those of you not following us on Twitter, I didn’t want you to miss this one.

The author — Charles Taylor Kerchner, Research Professor at Claremont Graduate University — puts the key pieces of the case in context as well as under a fair and appropriate light, more specifically explaining why “this is a case that the plaintiffs can’t win and the defendants will lose regardless of the outcome.” This, in other words and as per his opinion, is a case that will ultimately yield no winners.

Do read Kerchner’s full Education Week piece here, and share out as you see fit. I’ve also copied/pasted the text below (e.g., for those of you who follow via email).

*****

As the trial court arguments concluded in the spring of 2014, one of the first ‘On California’ posts argued that, “from our perspective this is a case that the plaintiffs can’t win and the defendants will lose regardless of the outcome.”  It still is.

Oral arguments on its appeal began last week, a decision is due in 90 days, and an appeal to the state Supreme Court is considered a near certainty.  Just in case you haven’t been listening to the well-oiled noise machine surrounding the case, EdWeek’s Stephen Sawchuk provides a backgrounder.

Teacher Labor Market Realities

First of all, the plaintiffs can’t win this case because they don’t understand—or willfully ignore—the realities of the teacher labor market.  The underlying problem in the supply and demand for teachers is not that young very good teachers were being fired while old sluggish ones held on to their jobs.  As the recent data on teacher shortages shows, the problem is attracting good people to teaching in the first place and holding onto them.  Most young teachers who teach in challenging schools leave because the work is too hard, not because they were laid off. 

If the plaintiffs really want to increase the quality of the teacher work force, then they should put their money behind efforts to forgive student loans or provide residency programs for novice teachers so that they are not dissuaded by the shock of stepping into a classroom without a solid grounding in the practicalities of teaching.

Value Added Testing

Second, accepting Vergara equates to accepting value added testing as a valid means of assessing teacher performance.  Value added testing began as an attempt to substitute achievement gains for the more socially biased “league table” ranking of schools.  Its early advocates used the technique to demonstrate the influence that a good teacher has on a student’s long-term academic progress and economic life chances.  The economists that argued for the Vergara plaintiffs made much of this reasoning.

Unfortunately, value added systems are usually terrible when they are put in place. The “value” in value-added is nearly always scores on state standardized tests.  Some of these tests are not very good indicators.  For example, nearly all the state tests used by Vergara plaintiffs have been replaced by measures more aligned with the Common Core of state standards.

Most of the tests are only given in a few grades in a few subjects.  Teachers in other grades and subjects get a composite score based on how well the whole school or an entire grade performed, a score that has little to do with that teacher’s value added.

It’s nonsense to use such gross statistical artifacts as the means to dismiss a teacher, or to reward one.  (A Tennessee case featured a teacher who was denied a bonus because his value added scores didn’t make the cut.  He taught largely advanced students, who were not required to take the state tests, and thus his entire value added score rested on one class.)

Disparate Impact

Third, the case accepts the constitutional principle of “disparate impact.”  This evidentiary argument has its origins in housing discrimination cases where it has been held that a law or practice, such as a bank’s lending policy, need not be discriminatory on its face if its impact was unfairly felt. 

If one accepts that people of color are generally discriminated against, and that poor people of color are absolutely discriminated against, then any rule or regulation within the education system is vulnerable to a disparate impact challenge.  Any form of teacher tenure?  Licenses to teach?  A pension system that encourages older teachers to stay instead of making way for young, enthusiastic ones?  School district boundaries?  Civil service protections?  Because all these exist in an inherently discriminatory environment, they would all be vulnerable if Vergara were upheld.

Rich People and Simplistic Solutions

Fourth, Vergara points rich people toward simplistic solutions.  Venture philanthropy is built around the assumption that people with wealth can use their money to disrupt institutions rather than support existing ones.  Students Matter, which is bankrolling the Vergara lawsuit, is a good example. 

It tinkers with three relatively inconsequential aspects of teacher quality while ignoring the much more fundamental changes in teaching and learning that need to take place in order to create a 21st Century education system.

At least as a thought experiment, people with money ought to be required to specify where they are headed.  If public monopoly, which every high performing school system in the world uses to deliver education, is bad, then specify the alternative.  Hiding behind empty phrases such as “grossly incompetent teachers,” derived from a statistical analysis of state test scores, is no substitute for the hard intellectual work of designing a novel education system.

I’m with the so-called reformers in the belief that the education system put in place more than a century ago needs transformation, but certainly those who want to change it should be required to come up with something better than increasing the amount of time it takes to get tenure by 12 months.

Buying Bullets for Your Opponents

Fifth, Vergara has created yet another instance in which the California Teachers Association and the California Federation of Teachers can inflict damage on themselves.  I hope they prevail in this appeal.  They should.  But in winning, they lose.  They will continue to be a target of opportunity by Republicans and an object of scorn among school reformers. 

They have utterly failed to seize the opportunity for policy leadership presented by the lawsuit and the unprecedented but transitory political support they currently enjoy in Sacramento.

Rather than build on strength, a siege mentality has overtaken union leaders, as in “they’re all around us.”  If that’s the case, you’d think that the unions would quit supplying their opponents with ammunition.

I hope the appellate justices overturn Vergara, but regardless, the case will yield no winners.


Tennessee’s Trout/Taylor Value-Added Lawsuit Dismissed

As you may recall, one of 15 important lawsuits pertaining to teacher value-added estimates across the nation (Florida n=2, Louisiana n=1, Nevada n=1, New Mexico n=4, New York n=3, Tennessee n=3, and Texas n=1 – see more information here) was situated in Knox County, Tennessee.

Filed in February of 2015, with legal support provided by the Tennessee Education Association (TEA), Knox County teachers Lisa Trout and Mark Taylor charged that they were denied monetary bonuses after their Tennessee Value-Added Assessment System (TVAAS — the original Education Value-Added Assessment System (EVAAS)) teacher-level value-added scores were miscalculated. This lawsuit was also to contest the reasonableness, rationality, and arbitrariness of the TVAAS system, as per its intended and actual uses in this case, but also in Tennessee writ large. On this case, Jesse Rothstein (University of California – Berkeley) and I were serving as the Plaintiffs’ expert witnesses.

Unfortunately, however, last week (February 17, 2016) the Plaintiffs’ team received a Court order, written by U.S. District Judge Harry S. Mattice Jr., dismissing their claims. While the Court had substantial questions about the reliability and validity of the TVAAS, the Court determined that the State satisfied the very low threshold of the “rational basis test” at legal issue. I should note here, however, that all of the evidence that the lawyers for the Plaintiffs collected via their “extensive discovery,” including the affidavits both Jesse and I submitted on the Plaintiffs’ behalf, was unfortunately not considered in Judge Mattice’s ruling on the motion to dismiss. This, perhaps, makes sense given some of the assertions made by the Court, discussed below.

Ultimately, the Court found that the TVAAS-based, teacher-level value-added policy at issue was “rationally related to a legitimate government interest.” As per the Court order itself, Judge Mattice wrote that “While the court expresses no opinion as to whether the Tennessee Legislature has enacted sound public policy, it finds that the use of TVAAS as a means to measure teacher efficacy survives minimal constitutional scrutiny. If this policy proves to be unworkable in practice, plaintiffs are not to be vindicated by judicial intervention but rather by democratic process.”

Otherwise, as per an article in the Knoxville News Sentinel, Judge Mattice was “not unsympathetic to the teachers’ claims,” for example, given the TVAAS measures “student growth — not teacher performance — using an algorithm that is not fail proof.” Conversely, however, he noted in the Court order that the “TVAAS algorithms have been validated for their accuracy in measuring a teacher’s effect on student growth,” even if minimal. He also wrote that the test scores used in the TVAAS (and other models) “need not be validated for measuring teacher effectiveness merely because they are used as an input in a validated statistical model that measures teacher effectiveness.” This is, unfortunately, untrue. Nonetheless, he continued to write that even though the rational basis test “might be a blunt tool, a rational policymaker could conclude that TVAAS is ‘capable of measuring some marginal impact that teachers can have on their own students…[and t]his is all the Constitution requires.’”

In the end, Judge Mattice concluded in the Court order that, overall, “It bears repeating that Plaintiff’s concerns about the statistical imprecision of TVAAS are not unfounded. In addressing Plaintiffs’ constitutional claims, however, the Court’s role is extremely limited. The judiciary is not empowered to second-guess the wisdom of the Tennessee legislature’s approach to solving the problems facing public education, but rather must determine whether the policy at issue is rationally related to a legitimate government interest.”

It is too early to know whether the Plaintiffs’ team will appeal, although Judge Mattice dismissed the federal constitutional claims within the lawsuit “with prejudice.” As per an article in the Knoxville News Sentinel, this means that “it cannot be resurrected with new facts or legal claims or in another court. His decision can be appealed, though, to the 6th Circuit U.S. Court of Appeals.”

New York Teacher Sheri Lederman’s Lawsuit Update

Recall the New York lawsuit pertaining to Long Island teacher Sheri Lederman? The teacher who, by all accounts other than her recent (2013-2014) growth score of 1 out of 20, is a terrific 4th-grade teacher and 18-year veteran. She, along with her attorney and husband Bruce Lederman, is suing the state of New York to challenge the state’s growth-based teacher evaluation system. See prior posts about Sheri’s case here, here, and here. I am serving as part of Sheri’s team, along with Linda Darling-Hammond (Stanford), Aaron Pallas (Columbia University Teachers College), Carol Burris (Executive Director of the Network for Public Education Foundation), Brad Lindell (Long Island Research Consultant), Sean Corcoran (New York University), and Jesse Rothstein (University of California – Berkeley).

Bruce Lederman just emailed me with an update, and some links re: this update (below), and he gave me permission to share all of this with you.

The judge hearing this case recently asked the lawyers on both sides of Sheri’s case to brief the court by the end of this month (February 29, 2016) on a new issue, positioned and pushed back into the court by the New York State Education Department (NYSED). The issue to be heard pertains to the state’s new “moratorium” or “emergency regulations” related to the state’s high-stakes use of its growth scores, all of which is likely related to the political reaction to the opt-out movement throughout the state of New York, the publicity pertaining to the Lederman lawsuit in and of itself, and the federal government’s adoption of the recent Every Student Succeeds Act (ESSA) given its specific provision that now permits states to decide whether (and if so how) to use teachers’ students’ test scores to hold teachers accountable for their levels of growth (in New York) or value-added.

While the federal government did not abolish such practices via its ESSA, the federal government did hand back to the states all power and authority over this matter. Accordingly, this does not mean growth models/VAMs are going to simply disappear, as states do still have the power and authority to move forward with their prior and/or their new teacher evaluation systems, based, in small or large part, on growth models/VAMs. As also quite evident since President Obama’s signing of the ESSA, some states are continuing to move forward in this regard, and regardless of the ESSA, in some cases at even higher speeds than before, in support of what some state policymakers still apparently believe (despite the research) are the accountability measures that will still help them to (symbolically) support educational reform in their states. See, for example, prior posts about the state of Alabama, here, New Mexico, here, and Texas, here, which is still moving forward with its plans introduced pre-ESSA. See prior posts about New York here, here, and here, the state in which, just one year ago, Governor Cuomo was promoting increased use of New York’s growth model and publicly proclaiming that it was “baloney” that more teachers were not being found “ineffective,” after which Cuomo pushed, through the New York budget process, amendments to the law increasing the weight of teachers’ growth scores to approximately 50% in many cases.

Nonetheless, as per this case in New York, state Attorney General Eric Schneiderman, on behalf of the NYSED, offered to settle this lawsuit out of court by giving Sheri some accommodation on her aforementioned 2013-2014 score of 1 out of 20, if Sheri and Bruce dropped the challenge to the state’s VAM-based teacher evaluation system. Sheri and Bruce declined, for a number of reasons, including that under the state’s recent “moratorium,” the state’s growth model is still set to be used throughout the state of New York for the next four years, with teachers’ annual performance reviews based in part on growth scores reported to parents, newspapers (on an aggregate basis), and the like. While, again, high stakes are not to be attached to the growth output for four years, the scores will still “count.”

Hence, Sheri and Bruce believe that because they have already “convincingly” shown that the state’s growth model does not “rationally” work for teacher evaluation purposes, and that teacher evaluations as based on the state’s growth model actually violate state law since teachers like Sheri are not capable of getting perfect scores (which is “irrational”), they will continue with this case, also on behalf of New York teachers and principals who are “demoralized” by the system, as well as New York taxpayers who are paying millions “if not tens of millions of dollars” for the system’s (highly) unreliable and inaccurate results.

As per Bruce’s email: “Spending the next 4 years studying a broken system is a terrible idea and terrible waste of taxpayer $$s. Also, if [NYSED] recognizes that Sheri’s 2013-14 score of 1 out of 20 is wrong [which they apparently recognize given their offer to settle this suit out of court], it’s sad and frustrating that [NYSED] still wants to fight her score unless she drops her challenge to the evaluation system in general.”

“We believe our case is already responsible for the new administrative appeal process in NY, and also partly responsible for Governor Cuomo’s apparent reversal on his stand about teacher evaluations. However, at this point we will not settle and allow important issues to be brushed under the carpet. Sheri and I are committed to pressing ahead with our case.”

To read more about this case via a Politico New York article click here (registration required). To hear more from Bruce Lederman about this case via WCNY-TV, Syracuse, click here. The pertinent section of this interview starts at 22:00 minutes and ends at 36:21. It’s well worth listening to!

New Mexico to Change its Teacher Evaluation System, But Not Really

As you all likely recall, the American Federation of Teachers (AFT), joined by the Albuquerque Teachers Federation (ATF), last year, filed a “Lawsuit in New Mexico Challenging [the] State’s Teacher Evaluation System.” In December 2015, state District Judge David K. Thomson granted a preliminary injunction preventing consequences from being attached to the state’s teacher evaluation data. More specifically, Judge Thomson ruled that the state can proceed with “developing” and “improving” its teacher evaluation system, but the state is not to make any consequential decisions about New Mexico’s teachers using the data the state collects until the state (and/or others external to the state) can evidence to the court that the system is reliable, valid, fair, uniform, and the like (see prior post on this ruling here).

Late Friday afternoon, New Mexico’s Public Education Department (PED) announced that they are accordingly changing their NMTEACH teacher evaluation system, and they will be issuing new regulations. Their primary goal is as follows: To (1) “Address major liabilities resulting from litigation” as these liabilities specifically pertain to the former NMTEACH system’s (a) Uniformity, (b) Transparency, and (c) Clarity. On the surface level, this is gratifying to the extent that the state is attempting to, at least theoretically, please the court. But we, and especially those in New Mexico, might refrain from celebrating too soon…given that when one reads the PED announcement here, one will see that this is yet another example of the state’s futile attempts to keep in place a very top-down teacher evaluation system. Note, however, that a uniform teacher evaluation system is also required under state law, although the governor has the right to change state statute should those at the state (including the governor, state superintendent, and PED) decide to eventually work with local districts and schools regarding the construction of a better teacher evaluation system for the state.

As per the PED’s subsequent goals, accordingly, things do not look much different than what they did in the past, especially given why and what got the state involved in such litigation in the first place. While the state also intends to (2) Simplify processes for districts/charters and also for the PED, and this is more or less fair, the state is also to (3) Establish a timeline for providing to districts and schools more current data in that currently such data are delayed by one school year, and these data are (still) needed for the state’s Pay for Performance plans (which was considered one high-stakes consequence at issue in Judge Thomson’s ruling). A tertiary goal is also to deliver such data in a more timely fashion to teacher preparation programs, which is something also of great controversy, as (uninformed) policymakers also continue to believe that colleges of education should also be held accountable for the test scores of their graduates’ students (see why this is problematic, for example, here). In the state’s final expressed goal, they make it explicit that (4) “Moving the timeline enhances the understanding that this system isn’t being used for termination decisions.” While this is certainly good, at least for now, the performance pay program is still something that is of concern. As are the state’s continued attempts to (still) use students’ test scores to evaluate teachers, and the state’s perpetual beliefs that the data errors also exposed by the lawsuit were the fault of the school districts, not the state, which Judge Thomson also noted.

Regardless, here is the state’s “Legal Rationale,” and here is also where things go a bit more awry. As re-positioned by the state/PED, they write that “the NEA and AFT recently advanced lawsuits set on eliminating any meaningful teacher evaluation [emphasis added to highlight the language the state is using to distort the genuine purposes of these lawsuits]. These lawsuits have exposed that the flexibility provided to local authorities has created confusion and complexity. Judge Thomson used this complexity when granting an injunction in the AFT case—citing a confusing array of classifications, tags, assessments, graduated considerations, etc. Judge Thomson made clear that he views this local authority as a threat to the statutorily required uniformity of the system [emphasis added given Judge Thomson said nothing of this sort, in terms of devaluing local authority or control, but rather, he emphasized the state’s menu of options was arbitrary and not uniform, especially given the consequences the state was requiring districts to enforce].” This, again, pertains to what is written in the current state statute in terms of a uniform teacher evaluation system.

Accordingly, and unfortunately, the state’s proposed changes would: “Provide a single plan that all districts and charters would use, providing greater uniformity,” and “Simplify the model from 107 possible classifications to three.” See three other moves detailed in the PED announcement here (e.g., moving data delivery dates, eliminating all but three tests, and the fall 2016 date which all of this is to become official).

Related, see a visual of what the state’s “new and improved” teacher evaluation system, in response to said litigation, is to look like. Unfortunately, again, it really does not look much different than it did prior except, perhaps, in the proposed reductions of testing options. See also the full document from which all of this came here.


Nonetheless, we will have to wait to see if this, again, will please the court and satisfy Judge Thomson’s ruling that the state is not to make any consequential decisions about New Mexico’s teachers using the data the state collects until the state (and/or others external to the state) can evidence to the court that the system is reliable, valid, etc.

And as for what the President of the American Federation of Teachers (AFT) New Mexico – Stephanie Biondo-Ly – had to say in response, see her press release below. See also an article in the Las Cruces – Sun Times here, in which President Ly is cited as “denounc[ing] the changes and call[ing] them attempts to obscure deficiencies in the [state’s] evaluation system.” From her original press release, she also wrote: “We are troubled…that once again, these changes are being implemented from the top down and if the secretary [Hanna Skandera] and her [PED] staff were serious about improving student outcomes and producing a fair evaluation system, they would have involved teachers, principals, and superintendents in the process.”


Houston Lawsuit Update, with Summary of Expert Witnesses’ Findings about the EVAAS

Recall from a prior post that a set of teachers in the Houston Independent School District (HISD), with the support of the Houston Federation of Teachers (HFT), are taking their district to federal court to fight for their rights as professionals, rights that they argue have been violated by their value-added scores, derived via the Education Value-Added Assessment System (EVAAS). The case, Houston Federation of Teachers, et al. v. Houston ISD, is to officially begin in court early this summer.

More specifically, the teachers are arguing that EVAAS output is inaccurate, that the EVAAS is unfair, that teachers are being evaluated via the EVAAS using tests that do not match the curriculum they are to teach, that the EVAAS system fails to control for student-level factors that impact how well teachers perform but that are outside of teachers’ control (e.g., parental effects), that the EVAAS is incomprehensible and hence very difficult if not impossible to actually use to improve instruction (i.e., it is not actionable), and, accordingly, that teachers’ due process rights are being violated because teachers do not have adequate opportunities to change as a result of their EVAAS results.
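
On the claim about uncontrolled student-level factors specifically, a toy example may help show the mechanism. Below is a minimal sketch with simulated data and a made-up "family_support" variable (purely an assumption for illustration, not anything in the EVAAS); it shows how omitting a factor that is outside a teacher's control, but correlated with which students she is assigned, can bias a naive value-added estimate against her.

```python
# A minimal sketch, on simulated data, of omitted-variable bias in a naive
# value-added estimate. "family_support" is a made-up stand-in for any
# student-level factor outside the teacher's control; this is an illustration
# of the general statistical point, not the EVAAS model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({"teacher": rng.choice(["A", "B"], n)})

# Teacher B is assigned students with systematically less outside support;
# both teachers are, by construction, equally effective (true effect = 0).
df["family_support"] = rng.normal(0, 1, n) - 0.5 * (df["teacher"] == "B")
df["prior_score"] = rng.normal(0, 1, n)
df["current_score"] = (df["prior_score"]
                       + 2.0 * df["family_support"]
                       + rng.normal(0, 1, n))

naive = smf.ols("current_score ~ prior_score + C(teacher)", data=df).fit()
adjusted = smf.ols("current_score ~ prior_score + family_support + C(teacher)",
                   data=df).fit()

# The naive model attributes the unmeasured disadvantage to teacher B.
print("Naive 'teacher B effect':   ", round(naive.params["C(teacher)[T.B]"], 2))
print("Adjusted 'teacher B effect':", round(adjusted.params["C(teacher)[T.B]"], 2))
```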

The EVAAS is the one value-added model (VAM) on which I’ve conducted most of my research, also in this district (see, for example, here, here, here, and here); hence, Jesse Rothstein – Professor of Public Policy and Economics at the University of California – Berkeley, who also conducts extensive research on VAMs – and I are serving as the expert witnesses in this case.

What was recently released regarding this case is a summary of the contents of our affidavits, as interpreted by the authors of the attached “EVAAS Litigation Update,” in which the authors declare, with our and others’ research in support, that “Studies Declare EVAAS ‘Flawed, Invalid and Unreliable.’” Here are the twelve key highlights, again, as summarized by the authors of this report and re-summarized, by me, below:

  1. Large-scale standardized tests have never been validated for their current uses. In other words, as per my affidavit, “VAM-based information is based upon large-scale achievement tests that have been developed to assess levels of student achievement, but not levels of growth in student achievement over time, and not levels of growth in student achievement over time that can be attributed back to students’ teachers, to capture the teachers’ [purportedly] causal effects on growth in student achievement over time.”
  2. The EVAAS produces different results from another VAM. When, for this case, Rothstein constructed and ran an alternative, albeit similarly sophisticated, VAM (using the same HISD data for both models), he found that the results “yielded quite different rankings and scores.” This should not happen if these models are indeed yielding indicators of truth, or true levels of teacher effectiveness from which valid interpretations and assertions can be made.
  3. EVAAS scores are highly volatile from one year to the next. Rothstein, when running the actual data, found that while “[a]ll VAMs are volatile…EVAAS growth indexes and effectiveness categorizations are particularly volatile due to the EVAAS model’s failure to adequately account for unaccounted-for variation in classroom achievement.” In addition, volatility is “particularly high in grades 3 and 4, where students have relatively few[er] prior [test] scores available at the time at which the EVAAS scores are first computed.”
  4. EVAAS overstates the precision of teachers’ estimated impacts on growth. As per Rothstein, “This leads EVAAS to too often indicate that teachers are statistically distinguishable from the average…when a correct calculation would indicate that these teachers are not statistically distinguishable from the average.” (See an illustrative sketch of this precision issue after this list.)
  5. Teachers of English Language Learners (ELLs) and “highly mobile” students are substantially less likely to demonstrate added value, as per the EVAAS, and likely most/all other VAMs. This, what we term as “bias,” makes it “impossible to know whether this is because ELL teachers [and teachers of highly mobile students] are, in fact, less effective than non-ELL teachers [and teachers of less mobile students] in HISD, or whether it is because the EVAAS VAM is biased against ELL [and these other] teachers.”
  6. The number of students each teacher teaches (i.e., class size) also biases teachers’ value-added scores. As per Rothstein, “teachers with few linked students—either because they teach small classes or because many of the students in their classes cannot be used for EVAAS calculations—are overwhelmingly [emphasis added] [more] likely to be assigned to the middle effectiveness category under EVAAS (labeled ‘no detectable difference [from average], and average effectiveness’) than are teachers with more linked students.”
  7. Ceiling effects are certainly an issue. Rothstein found that in some grades and subjects, “teachers whose students have unusually high prior year scores are very unlikely to earn high EVAAS scores, suggesting that ‘ceiling effects’ in the tests are certainly relevant factors.” While EVAAS and HISD have previously acknowledged such problems with ceiling effects, they apparently believe these effects are being mediated with the new and improved tests recently adopted throughout the state of Texas. Rothstein, however, found that these effects persist even given the new and improved tests.
  8. There are major validity issues with “artificial conflation.” This is a term I recently coined to represent what is happening in Houston, and elsewhere (e.g., Tennessee), when district leaders (e.g., superintendents) mandate or force principals and other teacher effectiveness appraisers or evaluators, for example, to align their observational ratings of teachers’ effectiveness with value-added scores, with the latter being the “objective measure” around which all else should revolve, or align; hence, the conflation of the one to match the other, even if entirely invalid. As per my affidavit, “[t]o purposefully and systematically endorse the engineering and distortion of the perceptible ‘subjective’ indicator, using the perceptibly ‘objective’ indicator as a keystone of truth and consequence, is more than arbitrary, capricious, and remiss…not to mention in violation of the educational measurement field’s Standards for Educational and Psychological Testing” (American Educational Research Association (AERA), American Psychological Association (APA), National Council on Measurement in Education (NCME), 2014).
  9. Teaching-to-the-test is of perpetual concern. Both Rothstein and I, independently, noted concerns about how “VAM ratings reward teachers who teach to the end-of-year test [more than] equally effective teachers who focus their efforts on other forms of learning that may be more important.”
  10. HISD is not adequately monitoring the EVAAS system. According to HISD, EVAAS modelers keep the details of their model secret, even from them and even though they are paying an estimated $500K per year for district teachers’ EVAAS estimates. “During litigation, HISD has admitted that it has not performed or paid any contractor to perform any type of verification, analysis, or audit of the EVAAS scores. This violates the technical standards for use of VAM that AERA specifies, which provide that if a school district like HISD is going to use VAM, it is responsible for ‘conducting the ongoing evaluation of both intended and unintended consequences’ and that ‘monitoring should be of sufficient scope and extent to provide evidence to document the technical quality of the VAM application and the validity of its use’ (AERA Statement, 2015).”
  11. EVAAS lacks transparency. AERA emphasizes the importance of transparency with respect to VAM uses. For example, as per the AERA Council who wrote the aforementioned AERA Statement, “when performance levels are established for the purpose of evaluative decisions, the methods used, as well as the classification accuracy, should be documented and reported” (AERA Statement, 2015). However, and in contrast to meeting AERA’s requirements for transparency, in this district and elsewhere, as per my affidavit, the “EVAAS is still more popularly recognized as the ‘black box’ value-added system.”
  12. Related, teachers lack opportunities to verify their own scores. This part is really interesting. “As part of this litigation, and under a very strict protective order that was negotiated over many months with SAS [i.e., SAS Institute Inc. which markets and delivers its EVAAS system], Dr. Rothstein was allowed to view SAS’ computer program code on a laptop computer in the SAS lawyer’s office in San Francisco, something that certainly no HISD teacher has ever been allowed to do. Even with the access provided to Dr. Rothstein, and even with his expertise and knowledge of value-added modeling, [however] he was still not able to reproduce the EVAAS calculations so that they could be verified.” Dr. Rothstein added, “[t]he complexity and interdependency of EVAAS also presents a barrier to understanding how a teacher’s data translated into her EVAAS score. Each teacher’s EVAAS calculation depends not only on her students, but also on all other students within HISD (and, in some grades and years, on all other students in the state), and is computed using a complex series of programs that are the proprietary business secrets of SAS Incorporated. As part of my efforts to assess the validity of EVAAS as a measure of teacher effectiveness, I attempted to reproduce EVAAS calculations. I was unable to reproduce EVAAS, however, as the information provided by HISD about the EVAAS model was far from sufficient.”
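
To make item 4’s precision point more concrete, below is a minimal sketch with simulated data (the variance components and class sizes are hypothetical assumptions, not SAS’s actual calculation). It illustrates how treating every student residual as an independent observation, rather than accounting for classroom-level variation, can understate the standard error of a teacher’s estimate and make an average teacher look statistically distinguishable from average more often than warranted.

```python
# A minimal sketch, on simulated data, of how ignoring classroom-level variation
# can understate the standard error (SE) of a teacher's estimated effect. The
# variance components and class sizes are hypothetical, not EVAAS's values.
import numpy as np

rng = np.random.default_rng(7)
n_students, n_classrooms = 25, 5

# One average teacher (true effect = 0), observed over several classrooms.
# Residual achievement has a classroom-level shock plus student-level noise.
classroom_shocks = rng.normal(0, 15, n_classrooms)
residuals = np.concatenate(
    [shock + rng.normal(0, 10, n_students) for shock in classroom_shocks]
)

estimate = residuals.mean()

# Naive SE: treats all student residuals as independent draws.
naive_se = residuals.std(ddof=1) / np.sqrt(len(residuals))

# Cluster-aware SE: treats each classroom mean as one effective observation.
classroom_means = residuals.reshape(n_classrooms, n_students).mean(axis=1)
cluster_se = classroom_means.std(ddof=1) / np.sqrt(n_classrooms)

print(f"estimate = {estimate:.2f}")
print(f"naive SE = {naive_se:.2f}, cluster-aware SE = {cluster_se:.2f}")
print(f"naive |t| = {abs(estimate / naive_se):.2f}, cluster |t| = {abs(estimate / cluster_se):.2f}")
```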