ASU Regents’ Professor Emeritus David Berliner at Vergara v. California

As you (hopefully) recall from a prior post, nine “students” from the Los Angeles School District are currently suing the state of California “arguing that their right to a good education is [being] violated by job protections that make it too difficult to fire bad [teachers].” This case is called Vergara v. California, and it is meant to challenge “the laws that handcuff schools from giving every student an equal opportunity to learn from effective teachers.” Behind these nine students stand a Silicon Valley technology magnate (David Welch), who is financing the case; an all-star cast of lawyers; and Students Matter, the organization Welch founded.

This past Tuesday (March 18, 2014 – “Vergara Trial Day 28”), David C. Berliner, Regents’ Professor Emeritus here at Arizona State University (ASU), who also just happens to be my forever mentor and an academic luminary, took the stand. He spoke, primarily, about the out-of-school factors that impact student performance in schools and how these factors bias all estimates based on test scores (often regardless of the controls used – see a most recent post about this evidence of bias here).

As per a recent post by Diane Ravitch (thanks to an insider at the trial) Berliner said:

“The public and politicians and parents overrate the in-school effects on their children and underrate the power of out-of-school effects on their children.” He noted that in-school factors account for just 20 percent of the variation we see in student achievement scores.

He also discussed value-added models and the problems with solely relying on these models for teacher evaluation. He said, “My experience is that teachers affect students incredibly. Probably everyone in this room has been affected by a teacher personally. But the effect of the teacher on the score, which is what’s used in VAM’s, or the school scores, which is used for evaluation by the Feds — those effects are rarely under the teacher’s control…Those effects are more often caused by or related to peer-group composition…”

Now, Students Matter has offered an interesting (and not surprising) take on Berliner’s testimony, given their own slant/biases as supporters of this case, which can also be found at Vergara Trial Day 28. But please read this with caution as the author(s) of this summary, let’s say, twisted some of the truths in Berliner’s testimony.

Berliner’s reaction? “Boy did they twist it. Dirty politics.” Hmm…

One Teacher’s Dystopian Reality

Chris Gilbert, an English teacher from North Carolina, a state that uses the well-known and widely used (and also proprietary) Education Value-Added Assessment System (EVAAS), emailed the other day, sharing two articles he wrote for the Washington Post, on behalf of his fellow teachers, about his experiences being evaluated using the EVAAS system.

This one (click here) is definitely worth a full read, especially because this one comes directly from an educator living out VAMs in practice, in the field, and in what he terms his dystopian reality.

He writes: “In this dystopian story, teachers are evaluated by standardized test scores and branded with color-coded levels of effectiveness, students are abstracted into inhuman measures of data, and educational value is assessed by how well forecasted “growth” levels are met. Surely, this must be a fiction.”


More Value-Added Problems in DC’s Public Schools

Over the past month I have posted two entries about what’s going on in DC’s public schools with the value-added-based teacher evaluation system developed and advanced by the former School Chancellor Michelle Rhee and carried on by the current School Chancellor Kaya Henderson.

The first post was about a bogus “research” study in which National Bureau of Economic Research (NBER)/University of Virginia and Stanford researchers made overstated, false claims that the system was indeed working and effective, despite the fact that (among other problems) 83% of the teachers in the study did not have student test scores available to measure their “value added.” The second post was about a DC teacher’s experiences being evaluated under this system (as part of the aforementioned 83%) using almost solely his administrator’s and master educator’s observational scores. That post demonstrated, with data, how error prone this part of the DC system also proved to be.

Adding to the value-added issues in DC, DC public school officials just announced (the day before winter break), and two Washington Post articles then reported (see the first article here and the second here), that 44 DC public school teachers also received incorrect evaluation scores for the last academic year (2012-2013) because of technical errors in the ways the scores were calculated. One of the 44 teachers was fired as a result, although (s)he is now looking to be reinstated and compensated for the salary lost.

While “[s]chool officials described the errors as the most significant since the system launched a controversial initiative in 2009 to evaluate teachers in part on student test scores,” they also downplayed the situation as only impacting 44.

VAM formulas are certainly “subject to error,” and they are subject to error always, across the board, for teachers in general as well as the 470 DC public school teachers with value-added scores based on student test scores. Put more accurately, just over 10% (n=470) of all DC teachers (n=4,000) were evaluated using their students’ test scores, which is even less than the 17% implied by the 83% figure mentioned above. And for about 10% of these teachers (n=44), calculation errors were found.
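For readers who want to check the arithmetic, here is a minimal sketch using only the counts quoted in this post (4,000 teachers, 470 with test-score-based value-added scores, 44 with miscalculated scores). The variable names are mine, and the counts are the approximate figures reported, not official data.

```python
# Counts as quoted in the post (approximate, not official data).
total_teachers = 4000   # all DC public school teachers
with_vam_scores = 470   # teachers with test-score-based value-added scores
with_errors = 44        # teachers whose evaluation scores were miscalculated

# Share of all teachers who had a value-added score at all.
share_with_vam = with_vam_scores / total_teachers

# Share of all teachers evaluated without student test scores.
share_without_vam = 1 - share_with_vam

# Error rate among the teachers who did have value-added scores.
error_rate = with_errors / with_vam_scores

print(f"With value-added scores:    {share_with_vam:.2%}")
print(f"Without value-added scores: {share_without_vam:.2%}")
print(f"Errors among scored:        {error_rate:.2%}")
```

In other words, the 44 errors are small relative to all 4,000 teachers, but they are not small relative to the 470 teachers whose evaluations actually depended on these calculations.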

This is not a “minor glitch” as written into a recent Huffington Post article covering the same story, which positions the teachers’ unions as almost irrational for “slamming the school system for the mistake and raising broader questions about the system.” It is a major glitch caused both by inappropriate “weightings” of teachers’ administrators’ and master educators’ observational scores, as well as “a small technical error” that directly impacted the teachers’ value-added calculations. It is a major glitch with major implications about which others, including not just those from the unions but many (e.g., 90%) from the research community, are concerned. It is a major glitch that does warrant additional concern about this AND all of the other statistical and other errors not mentioned but prevalent in all value-added scores (e.g., the errors always found in large-scale standardized tests particularly given their non-equivalent scales, the errors caused by missing data, the errors caused by small class sizes, the errors caused by summer learning loss/gains, the errors caused by other teachers’ simultaneous and carry over effects, the errors caused by parental and peer effects [see also this recent post about these], etc.).

So what type of consequence is to be in store for those perpetuating such nonsense, including, particularly here, those charged with calculating and releasing value-added “estimates” (“estimates” as these are not and should never be interpreted as hard data), and also the reporters who report on the issues without understanding them or reading the research about them? I, for one, would like to see them held accountable for the “value” they too are to “add” to our thinking about these social issues; instead, they detract and distract readers from the very real, research-based issues at hand.

Who “Added Value” to My Son’s Learning and Achievement?

’Tis the beginning of the holiday break and the end of the second semester of my son’s 5th grade year. This past Friday morning, after breakfast and before my son’s last day of school, he explained to me how his B grades in reading and language arts from his first grading period in 5th grade had now moved up to A scores in both subject areas.

When I asked him why he thought he dramatically improved both scores, he explained how he and his friends really wanted to become “Millionaires” as per their Accelerated Reading (AR) program word count goals. He thought this helped him improve his reading and language arts homework/test/benchmark scores (i.e., peer effects). He also explained how he thought that my husband and I requesting that he read every night and every morning before school also helped him become a better reader (surprise!) and increase his homework/test/benchmark scores (i.e., parental effects). And he explained how my helping him understand his tests/benchmarks, how to take them, how to focus, and how to (in my terms) analyze and exclude the “distractor” items (i.e., the items that are often times “right,” but not as “right” as the best and most correct response options, yet that are purposefully placed on multiple choice tests to literally “distract” test-takers from the correct answers) also helped his overall performance (i.e., the effects of having educated/professional parent(s)).

So, who “added value” to my son’s learning and achievement? Therein lies the rub. Will the complicated statistics used to capture my son’s teacher’s value-added this year take all of this into consideration, or better yet, control for it and factor it out? Will controlling for my son’s 4th grade scores (and other demographic variables as available) help to control for and counter that which happens in our home? I highly doubt it (with research evidence in support).

Complicating things further, my son’s 4th grade teacher last year liked my son and, knowing that I am a professor in a college of education, was sure to place him in arguably “the best” 5th grade teacher’s class this year. His teacher is, indeed, wonderful and, not surprisingly, a National Board Certified Teacher (NBCT), but my being a professor gave my son a distinct advantage…and perhaps gave his current teacher a “value-added” advantage as well.

Herein lies another rub. Who else besides my son was non-randomly placed into this classroom with this 5th grade teacher? And what other “external effects” might other parents of other students in this class be having on their own children’s learning and achievement, outside of school, similar to those explained by my son Friday morning? Can this too be taken into consideration, or better yet, controlled for and factored out? I highly doubt it, again.

And what will the implications for this teacher be when it comes time to measure her value-added? Lucky her, she will likely get the kudos and perhaps the monetary bonus she truly deserves, thanks in large part, though, to so many things that were indeed, and continue to be, outside of her control as a teacher…things she did not, even being a phenomenal teacher, cause or have a causal impact on. Yet another uncontrollable consideration that must be considered.

Stanford Professor, Dr. Edward Haertel, on VAMs

In a recent speech and subsequent paper written by Dr. Edward Haertel – National Academy of Education member and Professor at Stanford University – he writes about VAMs and the extent to which VAMs, being based on student test scores, can be used to make reliable and valid inferences about teachers and teacher effectiveness. This is a must-read, particularly for those out there who are new to the research literature in this area. Dr. Haertel is certainly an expert here, actually one of the best we have, and in this piece he captures the major issues well.

Some of the issues highlighted include concerns about the tests used to model value-added and how their scales (falsely assumed to be as objective and equal as units on a measuring stick) complicate and distort VAM-based estimates. He also discusses the general issues with the tests almost always, if not always, used when modeling value-added (i.e., the state-level tests mandated as per No Child Left Behind in 2002).

He discusses why VAM estimates are least trustworthy, and most volatile and error prone, when used to compare teachers who work in very different schools with very different student populations – students who do not attend schools in randomized patterns and who are rarely if ever randomly assigned to classrooms. The issues with bias, as highlighted by Dr. Haertel and also in a recent VAMboozled! post with a link to a new research article here, are probably the most major VAM-related problems/issues going. As captured in his words, “VAMs will not simply reward or penalize teachers according to how well or poorly they teach. They will also reward or penalize teachers according to which students they teach and which schools they teach in” (Haertel, 2013, pp. 12-13).

He reiterates issues with reliability, or a lack thereof. As per one research study he cites, researchers found that “a minimum of 10% of the teachers in the bottom fifth of the distribution one year were in the top fifth the next year, and conversely. Typically, only about a third of 1 year’s top performers were in the top category again the following year, and likewise, only about a third of 1 year’s lowest performers were in the lowest category again the following year. These findings are typical [emphasis added]…[While a] few studies have found reliabilities around .5 or a little higher…this still says that only half the variation in these value-added estimates is signal, and the remainder is noise [and/or error, which makes VAM estimates entirely invalid about half of the time]” (Haertel, 2013, p. 18).
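To make the “half signal, half noise” idea concrete, here is a minimal simulation sketch. It is not Dr. Haertel’s analysis or any real VAM; it simply assumes (my assumptions) that each hypothetical teacher has a stable “true” effect, that each year’s estimate is that effect plus independent random noise, that both are normally distributed, and that reliability is .5. It then counts how often teachers in the top or bottom fifth one year land in the top fifth the next.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def quintile_labels(scores):
    """Label each score with its quintile: 0 = bottom fifth, 4 = top fifth."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // n
    return labels


def simulate(n_teachers=20000, reliability=0.5, seed=0):
    """Simulate two years of noisy estimates for hypothetical teachers.

    Assumption: estimate = stable true effect + independent yearly noise,
    with total variance 1 and `reliability` as the signal share of it.
    """
    rng = random.Random(seed)
    signal_sd = reliability ** 0.5
    noise_sd = (1 - reliability) ** 0.5
    true_effect = [rng.gauss(0, signal_sd) for _ in range(n_teachers)]
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]

    q1, q2 = quintile_labels(year1), quintile_labels(year2)
    top1 = [i for i in range(n_teachers) if q1[i] == 4]
    bottom1 = [i for i in range(n_teachers) if q1[i] == 0]

    # Share of year-1 top-fifth teachers who are top-fifth again in year 2.
    stay_top = sum(q2[i] == 4 for i in top1) / len(top1)
    # Share of year-1 bottom-fifth teachers who jump to the top fifth.
    bottom_to_top = sum(q2[i] == 4 for i in bottom1) / len(bottom1)
    return pearson(year1, year2), stay_top, bottom_to_top


r, stay_top, bottom_to_top = simulate()
print(f"Year-to-year correlation: {r:.2f}")
print(f"Top fifth again next year: {stay_top:.1%}")
print(f"Bottom fifth to top fifth: {bottom_to_top:.1%}")
```

Even in this idealized setup, where nothing about the teachers changes from year to year, well under half of the “top” teachers stay on top. Lowering the `reliability` argument shows how quickly the rankings churn further.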

Dr. Haertel also discusses the correlations between VAM estimates and teacher observational scores, between VAM estimates and student evaluation scores, and between VAM estimates taken from the same teachers at the same time but using different tests, all of which are also abysmally (and unfortunately) low, similar to those mentioned above.

His bottom line? “VAMs are complicated, but not nearly so complicated as the reality they are intended to represent” (Haertel, 2013, p. 12). They just do not measure well what so many believe they measure so very well.

Again, to find out more reasons and more in-depth explanations as to why, click here for the full speech and subsequent paper.

Random Assignment and Bias in VAM Estimates – Article Published in AERJ

“Nonsensical,” “impractical,” “unprofessional,” “unethical,” and even “detrimental” – these are just a few of the adjectives used by elementary school principals in Arizona to describe the use of randomized practices to assign students to teachers and classrooms. When asked whether principals might consider random assignment practices, one principal noted, “I prefer careful, thoughtful, and intentional placement [of students] to random. I’ve never considered using random placement. These are children, human beings.” Yet the value-added models (VAMs) being used in many states to measure the “value added” by individual teachers to their students’ learning assume that any school is as likely as any other school, and any teacher is as likely as any other teacher, to be assigned any student who is as likely as any other student to have similar backgrounds, abilities, aptitudes, dispositions, motivations, and the like.

One of my doctoral students – Noelle Paufler – and I recently reported in the highly esteemed American Educational Research Journal the results of a survey administered to all public and charter elementary principals in Arizona (see the online publication of “The Random Assignment of Students into Elementary Classrooms: Implications for Value-Added Analyses and Interpretations”). We examined the various methods used to assign students to classrooms in their schools, the student background characteristics considered in nonrandom placements, and the roles teachers and parents play in the placement process. In terms of bias, the fundamental question here was whether the use of nonrandom student assignment practices might lead to biased VAM estimates, if the nonrandom student sorting practices went beyond that which is typically controlled for in most VAM models (e.g., academic achievement and prior demonstrated abilities, special education status, ELL status, gender, giftedness, etc.).

We found that, overwhelmingly, principals use various placement procedures through which administrators and teachers consider a variety of student background characteristics and student interactions to make placement decisions. In other words, student placements are by far nonrandom (contrary to the methodological assumptions that VAM consumers often accept).

Principals frequently cited interactions between students, students’ peers, and previous teachers as justification for future placements. Principals stated that students were often matched with teachers based on the students’ individual learning styles and the teachers’ respective teaching strengths. Parents also wielded considerable control over the placement process, with a majority of principals stating that parents made placement requests, the majority of which were honored.

In addition, in general, principal respondents were greatly opposed to using random student assignment methods in lieu of placement practices based on human judgment—practices they collectively agreed were in the best interest of students. Random assignment, even if necessary to produce unbiased VAM-based estimates, was deemed highly “nonsensical,” “impractical,” “unprofessional,” “unethical,” and even “detrimental” to student learning and teacher success.

The nonrandom assignment of students to classrooms has significant implications for the use of value-added models to estimate teacher effects on student learning using large-scale standardized test scores. Given the widespread use of nonrandom methods as indicated in this study, value-added researchers, policymakers, and educators should carefully consider the implications of their placement decisions as well as the validity of the inferences made using value-added estimates of teacher effectiveness.
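A toy simulation can illustrate why this matters. The sketch below is not the analysis from our article and not any state’s actual VAM; it is a deliberately simple gain-score comparison under assumptions I am making up for illustration: two hypothetical teachers who are exactly equally effective, students whose yearly gains also depend on an unmeasured home-support factor, and a “value-added” estimate equal to each teacher’s average student gain.

```python
import random


def estimated_gap(sorted_placement, n_students=20000, seed=1):
    """Return teacher A's estimated 'value-added' minus teacher B's.

    Assumptions (illustrative only): both teachers have the same true
    effect; each student's gain = teacher effect + home support + noise;
    the estimate is the mean gain of a teacher's students.
    """
    rng = random.Random(seed)
    true_teacher_effect = 0.0  # identical for both hypothetical teachers

    # Unmeasured out-of-school factor for each student.
    home_support = [rng.gauss(0, 1) for _ in range(n_students)]

    if sorted_placement:
        # Nonrandom placement: students with the most home support
        # (e.g., via parent requests) end up with teacher A.
        students = sorted(home_support, reverse=True)
    else:
        # Random placement: shuffle before splitting into two classes.
        students = home_support[:]
        rng.shuffle(students)

    half = n_students // 2
    class_a, class_b = students[:half], students[half:]

    def mean_gain(home_values):
        gains = [true_teacher_effect + h + rng.gauss(0, 1) for h in home_values]
        return sum(gains) / len(gains)

    return mean_gain(class_a) - mean_gain(class_b)


random_gap = estimated_gap(sorted_placement=False)
sorted_gap = estimated_gap(sorted_placement=True)
print(f"Gap under random placement:    {random_gap:+.2f}")
print(f"Gap under nonrandom placement: {sorted_gap:+.2f}")
```

Under random placement the two equally effective teachers look about the same. Under nonrandom placement teacher A appears far more effective, even though the entire difference comes from which students were placed in which classroom. Real VAMs include statistical controls that this sketch omits, so the question in practice is how much sorting remains beyond what those controls capture.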

Why VAMs & Merit Pay Aren’t Fair

An “oldie” (i.e., published about one year ago), but a goodie! This one is already posted in the video gallery of this site, but it recently came up again as a good, short (about three minutes) video that captures some of the main issues.
Check it out and share as (so) needed!

Six Reasons Why VAMs and Merit Pay Aren’t Fair