Stanford Professor Linda Darling-Hammond at Vergara v. California

As you recall from my most recent post, this past Tuesday (March 18, 2014 – “Vergara Trial Day 28”), David C. Berliner, Regents’ Professor Emeritus at Arizona State University (ASU), testified for six hours on behalf of the defense at Vergara v. California. He spoke, primarily, about the out-of-school and in-school peer factors that impact student performance in schools and how this impacts and biases all estimates based on test scores (e.g., VAMs).

Two days later, also on the side of the defense, Stanford Professor Linda Darling-Hammond took the stand (March 20, 2014 – “Vergara Trial Day 30”). For those of you who are not familiar with Linda Darling-Hammond, or her extensive career as one of the best, brightest, and most influential scholars in the academy of education, she is the nation’s leading expert on issues related to teacher quality, teacher recruitment and retention, teacher preparation, and, relatedly, teacher evaluation (e.g., using value-added measures).

Thanks to a friend of Diane Ravitch, an insider at the trial, here are some highlights of Darling-Hammond’s testimony as they pertain directly to our collective interests here on VAMboozled!

“On firing the bottom 5% of teachers…My opinion is that there are at least three reasons why firing the bottom 5 percent of teachers, as defined by the bottom 5 percent on an effectiveness continuum created by using the value-added test scores of their students on state tests, will not improve the overall effectiveness of teachers…One reason is that… value-added metrics are inaccurate for many teachers. In addition, they’re highly unstable. So the teachers who are in the bottom 5 percent in one year are unlikely to be the same teachers as who would be in the bottom 5 percent the next year, assuming they were left in place…the third reason is that when you create a system that is not oriented to attract high-quality teachers and support them in their work, that location becomes a very unattractive workplace…[we have]…empirical proof of that…situation currently in Houston, Texas [referencing my research in Houston], which has been firing many teachers at the bottom end of the value-added continuum without creating stronger overall achievement, and finding that they have fewer and fewer people who are willing to come apply for jobs in the district because with the instability of those scores, the inaccuracy and bias that they represent for groups of teachers…it’s become an unattractive place to work.”

“The statement is often made with respect to Finland that if you fire the bottom 5 percent [of teachers], we will be on a par with achievement in Finland. And Finland does none of those things. Finland invests in the quality of beginning teachers, trains them well, brings them into the classroom and supports them, and doesn’t need to fire a lot of teachers.”

“You can’t fire your way to Finland” (although this quote, also spoken by Darling-Hammond, did not come from this particular testimony).

While Students Matter (those financing this lawsuit, big time) twisted her testimony, again, like they did with the testimony of David Berliner (see the twists here), Darling-Hammond also testified about some other interesting and relevant topics. Here are some of the highlights from her testimony:

“On what a good evaluation process looks like….With respect to tenure decisions, first of all, you need to have – in the system, you need to have clear standards that you’re going to evaluate the teacher against, that express the kind of teaching practices that are expected; and a way of collecting evidence about what the teacher does in the classroom. That includes observations and may also include certain artifacts of the teacher’s work, like lesson plans, curriculum units, student work, et cetera…You need well-trained evaluators who know how to apply that instrument in a consistent and effective way…You want to have a system in which the evaluation is organized over a period of time so that the teacher is getting clarity about what they’re expected to do, feedback about what they’re doing, and so on.”

“On the problem with extending the tenure beyond two years…It’s important that while we want teachers to at some point have due process rights in their career, that that judgment be made relatively soon; and that a floundering teacher who is grossly ineffective is not allowed to continue for many years because a year is a long time in the life of a student…having the two-year mark – which means you’re making a decision usually within 19 months of the starting point of that teacher – has the interest of…encouraging districts to make that decision in a reasonable time frame so that students aren’t exposed to struggling teachers for longer than they might need to be….But at the end of the [d]ay, the most important thing is not the amount of time; the most important thing is the quality and the intensity of the evaluation and support process that goes on for beginning teachers.”

“On the benefits and importance of having a system that includes support for struggling teachers…it’s important both as a part of a due process expectation; that if somebody is told they’re not meeting a standard, they should have some help to meet that standard…in such programs, we often find that half of the teachers do improve. Others may not improve, and then the decision is more well-grounded. And when it is made, there is almost never a grievance or a lawsuit that follows because there’s [been] such a strong process of help…in the cases where the assistance may not prove adequate to help an incompetent teacher become competent, the benefit is that that teacher is going to be removed from the classroom sooner.”

ASU Regents’ Professor Emeritus David Berliner at Vergara v. California

As you (hopefully) recall from a prior post, nine “students” from the Los Angeles School District are currently suing the state of California “arguing that their right to a good education is [being] violated by job protections that make it too difficult to fire bad [teachers].” This case is called Vergara v. California, and it is meant to challenge “the laws that handcuff schools from giving every student an equal opportunity to learn from effective teachers.” Behind these nine students stand a Silicon Valley technology magnate (David Welch), who is financing the case and an all-star cast of lawyers, and Students Matter, the organization founded by said Welch.

This past Tuesday (March 18, 2014 – “Vergara Trial Day 28”), David C. Berliner, Regents’ Professor Emeritus here at Arizona State University (ASU), who also just happens to be my forever mentor and academic luminary, took the stand. He spoke, primarily, about the out-of-school factors that impact student performance in schools and how this impacts and biases all estimates based on test scores (often regardless of the controls used – see a most recent post about this evidence of bias here).

As per a recent post by Diane Ravitch (thanks to an insider at the trial), Berliner said:

“The public and politicians and parents overrate the in-school effects on their children and underrate the power of out-of-school effects on their children.” He noted that in-school factors account for just 20 percent of the variation we see in student achievement scores.

He also discussed value-added models and the problems with solely relying on these models for teacher evaluation. He said, “My experience is that teachers affect students incredibly. Probably everyone in this room has been affected by a teacher personally. But the effect of the teacher on the score, which is what’s used in VAM’s, or the school scores, which is used for evaluation by the Feds — those effects are rarely under the teacher’s control…Those effects are more often caused by or related to peer-group composition…”

Now, Students Matter has an interesting (and not surprising) take on Berliner’s testimony, given their own slant/biases as backers of this case, which can also be found at Vergara Trial Day 28. But please read it with caution as the author(s) of this summary, let’s say, twisted some of the truths in Berliner’s testimony.

Berliner’s reaction? “Boy did they twist it. Dirty politics.” Hmm…

Research Study: Missing Data and VAM-Based Bias

A new Assistant Professor here at ASU, from outside the College of Education but in the College of Mathematical and Natural Sciences, also specializes in value-added modeling (and statistics). Her name is Jennifer Broatch; she is a rising star in this area of research, and she just sent me an article that I had missed, have now read, and certainly found worth sharing with you all.

The peer-reviewed article, published in Statistics and Public Policy this past November, is fully cited and linked below so that you all can read it in full. But in terms of its CliffsNotes version, researchers evidenced the following two key findings:

First, researchers found that, “VAMs that include shorter test score histories perform fairly well compared to those with longer score histories.” The current thinking is that we need at least two if not three years of data to yield reliable estimates, or estimates that are consistent over time (which they should be). These authors argue that requiring three years of data causes so much data to go missing that the target is not worth shooting for. Rather, again they argue, this is an issue of trade-offs. This is certainly something to consider, as long as we continue to understand that all of this is about “tinkering towards a utopia” (Tyack & Cuban, 1997) that I’m not at all certain exists in terms of VAMs and VAM-based accuracy.

Second, researchers found that, “the decision about whether to control for student covariates [or background/demographic variables] and schooling environments, and how to control for this information, influences [emphasis added] which types of schools and teachers are identified as top and bottom performers. Models that are less aggressive in controlling for student characteristics and schooling environments systematically identify schools and teachers that serve more advantaged students as providing the most value-added, and correspondingly, schools and teachers that serve more disadvantaged students as providing the least.”

This certainly adds evidence to the research on VAM-based bias. While there are many researchers who still claim that controlling for student background variables is unnecessary when using VAMs, and if anything bad practice because controlling for such demographics causes perverse effects (e.g., if teachers focus relatively less on such students who are given such statistical accommodations or boosts), this study adds more evidence that “to not control” for such demographics does indeed yield biased estimates. The authors do not disclose, however, how much bias is still “left over” after the controls are used; hence, this is still a very serious point of contention. Whether the controls, even if used, function appropriately is still something to be taken in earnest, particularly when consequential decisions are to be tied to VAM-based output (see also “The Random Assignment of Students into Elementary Classrooms: Implications for Value-Added Analyses and Interpretations”).
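
To see why the choice of controls matters, here is a minimal simulation sketch. It is not the model used by Ehlert et al.; every number in it (class size, the 0.3 out-of-school penalty, the noise levels) is an assumption chosen only for illustration, and the “control” is a crude subtraction of each student group’s average gain rather than a full regression. Teacher effects are drawn independently of classroom composition, so an unbiased estimate should be unrelated to the share of disadvantaged students a teacher serves.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_teachers=400, class_size=25, penalty=0.3, seed=0):
    """Return (corr without controls, corr with crude controls).

    All parameters are hypothetical. Each correlation is between a
    teacher's estimated value-added and the share of disadvantaged
    students assigned to that teacher.
    """
    rng = random.Random(seed)
    shares = []    # intended share of disadvantaged students per teacher
    students = []  # (teacher index, disadvantaged flag, test-score gain)
    for j in range(n_teachers):
        share = rng.random()
        effect = rng.gauss(0, 0.1)  # true teacher effect, unrelated to share
        shares.append(share)
        for _ in range(class_size):
            disadvantaged = rng.random() < share
            gain = effect - (penalty if disadvantaged else 0.0) + rng.gauss(0, 0.5)
            students.append((j, disadvantaged, gain))

    def class_means(values):
        sums = [0.0] * n_teachers
        for (j, _, _), v in zip(students, values):
            sums[j] += v
        return [s / class_size for s in sums]

    # Specification A: no controls, value-added is the raw class mean gain.
    raw = [g for _, _, g in students]

    # Specification B: subtract the average gain of the student's own group.
    group_mean = {}
    for flag in (True, False):
        gains = [g for _, d, g in students if d == flag]
        group_mean[flag] = sum(gains) / len(gains)
    adjusted = [g - group_mean[d] for _, d, g in students]

    return pearson(shares, class_means(raw)), pearson(shares, class_means(adjusted))


if __name__ == "__main__":
    r_raw, r_adjusted = simulate()
    print(f"no controls:   r = {r_raw:.2f}")
    print(f"with controls: r = {r_adjusted:.2f}")
```

Under these assumptions the uncontrolled estimates should track classroom composition rather than teaching, while the adjusted estimates should not. What the sketch cannot show is the open question raised above: how much bias remains when the real sorting of students goes beyond what the controls capture.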

Citation: Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. (2013, November). The sensitivity of value-added estimates to specification adjustments: Evidence from school- and teacher-level models in Missouri. Statistics and Public Policy, 1(1), 19-27. doi: 10.1080/2330443X.2013.856152

Chicken or the Egg?

“Which came first, the chicken or the egg?” is an age-old question, but is more importantly a dilemma about identifying the real cases of cause and consequence.

Recall from a recent post that currently in the state of California nine public school students are challenging California’s teacher tenure system, arguing that their right to a good education is being violated by job protections that protect ineffective teachers, but do not protect the students from being instructed by said teachers. Recall, as well, that a wealthy technology magnate [David Welch] is financing the whole case, which is also backed by Students Matter. The ongoing suit is called “Vergara v. California.”

Welch and Students Matter have thus far brought to testify an all-star cast, most recently including Thomas Kane, an economics professor from Harvard University. Kane also directed the $45 million worth of Measures of Effective Teaching (MET) studies for the Bill & Melinda Gates Foundation and, not surprisingly as a VAM advocate, advanced a series of highly false claims about the wonderful potentials of VAMs, potentials that, once again, did not pass any sort of peer review but that still made it to the US Congress. To read about the many methodological and other problems with the MET studies click here.

If I were to make a list of VAMboozlers, Kane would be near the top of the list, especially as he is increasingly using his Harvard affiliation to advance his own (profitable) credibility in this area. To read an insightful post about just this, read VAMboozled! reader Laura Chapman’s comment at the bottom of a recent post here, in which she wrote, “Harvard is only one of a dozen high profile institutions that has become the source of propaganda about K-12 education and teacher performance as measured by scores on standardized tests.”

Anyhow, and as per a recent article in the Los Angeles Times, Kane testified that “Black and Latino students are more likely to get ineffective teachers in Los Angeles schools than white and Asian students,” and that “the worst teachers–in the bottom 5%–taught 3.2% of white students and 5.4% of Latino students. If ineffective teachers were evenly distributed, you’d expect that 5% of each group of students would have these low-rated instructors.” He concluded that “The teaching-quality imbalance especially hurts the neediest students because ‘rather than assign them more effective teachers to help close the gap with white students they’re assigned less effective teachers, which results in the gap being slightly wider in the following year.’”

Kane’s research was, of course, used to support the claim that bad teachers are causing the disparities that he cited, regardless of the fact that the inverse could also be equally, or even more, true: that the value-added measures used to measure teacher effectiveness in these schools are biased by the very nature of the students in these schools, whose low test scores contribute to such estimates. As is increasingly being demonstrated in the literature, these models are biased by the types of students in the classrooms and schools that contribute to the measures themselves.

So which one came first? The chicken or the egg? The question here, really, and the one I wish the defendants had posed, is whether the students in these schools caused such teachers to appear less effective when in fact they might have been just as effective as “similar” teachers teaching more advantaged kids across town. What we do know from the research literature is that, indeed, there are higher turnover rates in such schools, and oftentimes such schools become “dumping grounds” for teachers who cannot be terminated due to such tenure laws – this is certainly a problem. But to claim that teachers in such schools are causing poor achievement is certainly cause for concern, not to mention a professional and research-based ethics concern as well.
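
The chicken-or-egg problem can be made concrete with a small simulation sketch. It does not use Kane’s data or model; the school mix, class size, and the 0.3 out-of-school penalty are all hypothetical. The key assumption is that every teacher is exactly equally effective, so any pattern in who gets a “bottom 5%” teacher comes from the students, not the teaching.

```python
import random


def bottom_five_exposure(n_teachers=1000, class_size=25, penalty=0.3, seed=1):
    """Share of each student group taught by a 'bottom 5%' teacher.

    Assumptions (all hypothetical): every teacher has a true effect of
    zero; half of the classrooms are 80% disadvantaged and half are 20%
    disadvantaged; disadvantaged students gain `penalty` less for
    out-of-school reasons; teachers are ranked by raw class mean gain.
    """
    rng = random.Random(seed)
    classes = []  # (naive score, number disadvantaged, number advantaged)
    for j in range(n_teachers):
        share = 0.8 if j % 2 == 0 else 0.2
        n_dis = sum(rng.random() < share for _ in range(class_size))
        gains = [
            (-penalty if i < n_dis else 0.0) + rng.gauss(0, 0.5)
            for i in range(class_size)
        ]
        classes.append((sum(gains) / class_size, n_dis, class_size - n_dis))

    classes.sort(key=lambda c: c[0])      # lowest naive score first
    bottom = classes[: n_teachers // 20]  # the "bottom 5%" of teachers

    total_dis = sum(c[1] for c in classes)
    total_adv = sum(c[2] for c in classes)
    dis_rate = sum(c[1] for c in bottom) / total_dis
    adv_rate = sum(c[2] for c in bottom) / total_adv
    return dis_rate, adv_rate


if __name__ == "__main__":
    dis_rate, adv_rate = bottom_five_exposure()
    print(f"disadvantaged students with a bottom-5% teacher: {dis_rate:.1%}")
    print(f"advantaged students with a bottom-5% teacher:    {adv_rate:.1%}")
```

If the two rates differ in this setup, they do so even though no teacher is worse than any other. That is the inverse explanation: an uneven distribution of low-rated teachers is, on its own, consistent with biased ratings as well as with bad teaching.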

Kane’s “other notable finding was that the worst teachers in Los Angeles are doing more harm to students than the worst ones in other school systems that he compared. The other districts were New York City, Charlotte-Mecklenburg, Dallas, Denver, Memphis and Hillsborough County in Florida.” Don’t ask me how he figured that one out, across states that use different tests, different systems, and have in their schools entirely different and unique populations. Amazing what some economists can accomplish with advanced mathematical models…and just a few (heroic) assumptions.

David Berliner’s “Thought Experiment”

My main mentor, David Berliner (Regents Professor at Arizona State University) wrote a “Thought Experiment” that Diane Ravitch posted on her blog yesterday. I have pasted the full contents here for those of you who may have missed it. Do take a read, and play along and see if you can predict which state will yield higher test performance in the end.

—–

Let’s do a thought experiment. I will slowly parcel out data about two different states. Eventually, when you are nearly 100% certain of your choice, I want you to choose between them by identifying the state in which an average child is likely to be achieving better in school. But you have to be nearly 100% certain that you can make that choice.

To check the accuracy of your choice I will use the National Assessment of Educational Progress (NAEP) as the measure of school achievement. It is considered by experts to be the best indicator we have to determine how children in our nation are doing in reading and mathematics, and both states take this test.

Let’s start. In State A the percent of three and four year old children attending a state associated prekindergarten is 8.8% while in State B the percent is 1.7%. With these data think about where students might be doing better in 4th and 8th grade, the grades in which NAEP evaluates student progress in all our states. I imagine that most people will hold onto this information about preschool for a while and not yet want to choose one state over the other. A cautious person might rightly say it is too soon to make such a prediction based on a difference of this size, on a variable that has modest, though real effects on later school success.

So let me add more information to consider. In State A the percent of children living in poverty is 14% while in State B the percent is 24%. Got a prediction yet? See a trend? How about this related statistic: In State A the percent of households with food insecurity is 11.4% while in State B the percent is 14.9%. I can also inform you that in State A the percent of people without health insurance is 3.8% while in State B the percent is 17.7%. Are you getting the picture? Are you ready to pick one state over another in terms of the likelihood that one state has its average student scoring higher on the NAEP achievement tests than the other?

If you still say that this is not enough data to make yourself almost 100% sure of your pick, let me add more to help you. In State A the per capita personal income is $54,687 while in State B the per capita personal income is $35,979. Since per capita personal income in the country is now at about $42,693, we see that State A is considerably above the national average and State B is considerably below the national average. Still not ready to choose a state where kids might be doing better in school?

Alright, if you are still cautious in expressing your opinions, here is some more to think about. In State A the per capita spending on education is $2,764 while in State B the per capita spending on education is $2,095, about 25% less. Enough? Ready to choose now?

Maybe you should also examine some statistics related to the expenditure data, namely, that the pupil/teacher ratio (not the class sizes) in State A is 14.5 to one, while in State B it is 19.8 to one.

As you might now suspect, class size differences also occur in the two states. At the elementary and the secondary level, respectively, the class sizes for State A average 18.7 and 20.6. For State B those class sizes at elementary and secondary are 23.5 and 25.6, respectively. State B, therefore, averages at least 20% higher in the number of students per classroom. Ready now to pick the higher achieving state with near 100% certainty? If not, maybe a little more data will make you as sure as I am of my prediction.

In State A the percent of those who are 25 years of age or older with bachelor’s degrees is 38.7% while in State B that percent is 26.4%. Furthermore, the two states have just about the same size population. But State A has 370 public libraries and State B has 89.

Let me try to tip the data scales for what I imagine are only a few people who are reluctant to make a prediction. The percent of teachers with master’s degrees is 62% in State A and 41.6% in State B. And, the average public school teacher salary in the time period 2010-2012 was $72,000 in State A and $46,358 in State B. Moreover, during the time period from the academic year 1999-2000 to the academic year 2011-2012 the percent change in average teacher salaries in the public schools was +15% in State A. Over that same time period, in State B public school teacher salaries dropped 1.8%.

I will assume by now we almost all have reached the opinion that children in state A are far more likely to perform better on the NAEP tests than will children in State B. Everything we know about the ways we structure the societies we live in, and how those structures affect school achievement, suggests that State A will have higher achieving students. In addition, I will further assume that if you don’t think that State A is more likely to have higher performing students than State B you are a really difficult and very peculiar person. You should seek help!

So, for the majority of us, it should come as no surprise that in the 2013 data set on the 4th grade NAEP mathematics test State A was the highest performing state in the nation (tied with two others). And it had 16 percent of its children scoring at the Advanced level—the highest level of mathematics achievement. State B’s score was behind 32 other states, and it had only 7% of its students scoring at the Advanced level. The two states were even further apart on the 8th grade mathematics test, with State A the highest scoring state in the nation, by far, and with State B lagging behind 35 other states.

Similarly, it now should come as no surprise that State A was number 1 in the nation in the 4th grade reading test, although tied with 2 others. State A also had 14% of its students scoring at the advanced level, the highest rate in the nation. Students in State B scored behind 44 other states and only 5% of its students scored at the Advanced level. The 8th grade reading data was the same: State A walloped State B!

States A and B really exist. State B is my home state of Arizona, which obviously cares not to have its children achieve as well as do those in State A. Its poor achievement is by design. Proof of that is not hard to find. We just learned that 6000 phone calls reporting child abuse to the state were uninvestigated. Ignored and buried! Such callous disregard for the safety of our children can only occur in an environment that fosters, and then condones, a lack of concern for the children of Arizona, perhaps because they are often poor and often minorities. Arizona, given the data we have, apparently does not choose to take care of its children. The agency with the express directive of ensuring the welfare of children may need 350 more investigators of child abuse. But the governor and the majority of our legislature are currently against increased funding for that agency.

State A, where kids do a lot better, is Massachusetts. It is generally a progressive state in politics. To me, Massachusetts, with all its warts, resembles Northern European countries like Sweden, Finland, and Denmark more than it does states like Alabama, Mississippi or Arizona. According to UNESCO data and epidemiological studies it is the progressive societies like those in Northern Europe and Massachusetts that care much better for their children. On average, in comparisons with other wealthy nations, the U. S. turns out not to take good care of its children. With few exceptions, our politicians appear less likely to kiss our babies and more likely to hang out with individuals and corporations that won’t pay the taxes needed to care for our children, thereby ensuring that our schools will not function well.

But enough political commentary: Here is the most important part of this thought experiment for those who care about education. Every one of you who predicted that Massachusetts would outperform Arizona did so without knowing anything about the unions’ roles in the two states, the curriculum used by the schools, the quality of the instruction, the quality of the leadership of the schools, and so forth. You made your prediction about achievement without recourse to any of the variables the anti-public school forces love to shout about – incompetent teachers, a dumbed down curriculum, coddling of students, not enough discipline, not enough homework, and so forth. From a few variables about life in two different states you were able to predict differences in student achievement test scores quite accurately.

I believe it is time for the President, the Secretary of Education, and many in the press to get off the backs of educators and focus their anger on those who will not support societies in which families and children can flourish. Massachusetts still has many problems to face and overcome—but they are nowhere as severe as those in my home state and a dozen other states that will not support programs for neighborhoods, families, and children to thrive.

This little thought experiment also suggests that a caution for Massachusetts is in order. It seems to me that despite all their bragging about their fine performance on international tests and NAEP tests, it’s not likely that Massachusetts’ teachers, or their curriculum, or their assessments are the basis of their outstanding achievements in reading and mathematics. It is much more likely that Massachusetts is a high performing state because it has chosen to take better care of its citizens than do those of us living in other states. The roots of high achievement on standardized tests are less likely to be found in the classrooms of Massachusetts and more likely to be discovered in its neighborhoods and families, a reflection of the prevailing economic health of the community served by the schools of that state.

One Teacher’s Dystopian Reality

Chris Gilbert, an English teacher from North Carolina, a state that uses the well-known and widely used (and also proprietary) Education Value-Added Assessment System (EVAAS), emailed the other day, sharing two articles he wrote for the Washington Post, on behalf of his fellow teachers, about his experiences being evaluated using the EVAAS.

This one (click here) is definitely worth a full read, especially because this one comes directly from an educator living out VAMs in practice, in the field, and in what he terms his dystopian reality.

He writes: “In this dystopian story, teachers are evaluated by standardized test scores and branded with color-coded levels of effectiveness, students are abstracted into inhuman measures of data, and educational value is assessed by how well forecasted “growth” levels are met. Surely, this must be a fiction.”

Stanford Professor Darling-Hammond on America’s Testing Fixation and Frenzy

Just recently on National Public Radio (NPR), Linda Darling-Hammond, current Stanford Professor and former runner-up for appointment by President Obama as US Secretary of Education (Obama appointed current secretary Arne Duncan instead), was interviewed about why she thought “School Testing Systems Should Be Examined In 2014.”

Her reasons?

  • Post No Child Left Behind (that positioned states as the steroids of educational reform) America’s public schools have seen no substantive changes, or more specifically gains for the better, as intended. According to Darling-Hammond, “We’re actually not doing any better than we were doing a decade ago” when NCLB was first passed into legislation (2002).
  • “When No Child Left Behind was passed back in 2002, there was a target set for each year for each school [in each state] that [students in each state] would get to a place where 100 percent of students would be, quote/unquote ‘proficient’ on the state tests. Researchers knew even then that would be impossible.” Accordingly, and in many ways unfortunately, all states have since failed to meet this target (i.e., 100% proficiency), as falsely assumed, and predicted, and used as political rhetoric to endorse and pass NCLB by both Republicans and Democrats, making for one of the first bipartisan educational policies of its time.
  • “Testing has some utility, if you use it in thoughtful ways” but in our country we are lacking serious thought and consideration about that which tests can and should do versus that which they cannot and should not do.

Moving forward we MUST “change our policies around the nature of testing, the amount of testing, and the uses of testing…and move [forward] from a test-and-punish philosophy – which was the framework for No Child Left Behind…to an assess-and-improve philosophy.” More importantly, we MUST “address childhood poverty” as this, not testing and holding teachers accountable for their students’ test scores, is where true reform should be positioned.

While “certainly [a] good education is a [good] way out of poverty…we [still must] address some of these issues that adhere to poverty itself” and perpetually cause (and are significantly and substantially related to) low levels of student learning and low levels of achievement. “The two are completely intertwined, and we have to work on both at the same time.”

Stanford Professor, Dr. Edward Haertel, on VAMs

In a recent speech and subsequent paper, Dr. Edward Haertel – National Academy of Education member and Professor at Stanford University – writes about VAMs and the extent to which VAMs, being based on student test scores, can be used to make reliable and valid inferences about teachers and teacher effectiveness. This is a must-read, particularly for those out there who are new to the research literature in this area. Dr. Haertel is certainly an expert here, actually one of the best we have, and in this piece he captures the major issues well.

Some of the issues highlighted include concerns about the tests used to model value-added and how their scales (falsely assumed to be as objective and equal as units on a measuring stick) complicate and distort VAM-based estimates. He also discusses the general issues with the tests almost if not always used when modeling value-added (i.e., the state-level tests mandated as per No Child Left Behind in 2002).

He discusses why VAM estimates are least trustworthy, and most volatile and error prone, when used to compare teachers who work in very different schools with very different student populations – students who do not attend schools in randomized patterns and who are rarely if ever randomly assigned to classrooms. The issues with bias, as highlighted by Dr. Haertel and also in a recent VAMboozled! post with a link to a new research article here, are probably the most major VAM-related problems/issues going. As captured in his words, “VAMs will not simply reward or penalize teachers according to how well or poorly they teach. They will also reward or penalize teachers according to which students they teach and which schools they teach in” (Haertel, 2013, pp. 12-13).

He reiterates issues with reliability, or a lack thereof. As per one research study he cites, researchers found that “a minimum of 10% of the teachers in the bottom fifth of the distribution one year were in the top fifth the next year, and conversely. Typically, only about a third of 1 year’s top performers were in the top category again the following year, and likewise, only about a third of 1 year’s lowest performers were in the lowest category again the following year. These findings are typical [emphasis added]…[While a] few studies have found reliabilities around .5 or a little higher…this still says that only half the variation in these value-added estimates is signal, and the remainder is noise [and/or error, which makes VAM estimates entirely invalid about half of the time]” (Haertel, 2013, p. 18).
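
To get a feel for what a reliability of .5 implies for year-to-year rankings, here is a minimal simulation sketch. It is not taken from Dr. Haertel’s paper or the studies he cites; it assumes, hypothetically, that each teacher has a perfectly stable true effect, that the noise is independent from year to year, and that both are normally distributed. Reliability here is the share of the variance in a single year’s estimate that is signal.

```python
import random


def quintile_persistence(reliability=0.5, n_teachers=20000, seed=2):
    """Return (stay, flip) for teachers in year 1's bottom fifth.

    stay: fraction who are again in the bottom fifth in year 2.
    flip: fraction who land in the top fifth in year 2.
    Assumes a stable true effect plus independent yearly noise, scaled
    so that signal variance / total variance equals `reliability`.
    """
    rng = random.Random(seed)
    signal_sd = reliability ** 0.5
    noise_sd = (1 - reliability) ** 0.5

    year1, year2 = [], []
    for _ in range(n_teachers):
        true_effect = rng.gauss(0, signal_sd)
        year1.append(true_effect + rng.gauss(0, noise_sd))
        year2.append(true_effect + rng.gauss(0, noise_sd))

    k = n_teachers // 5  # number of teachers in each fifth
    order1 = sorted(range(n_teachers), key=year1.__getitem__)
    order2 = sorted(range(n_teachers), key=year2.__getitem__)
    bottom1 = set(order1[:k])
    bottom2 = set(order2[:k])
    top2 = set(order2[-k:])

    stay = len(bottom1 & bottom2) / k
    flip = len(bottom1 & top2) / k
    return stay, flip


if __name__ == "__main__":
    stay, flip = quintile_persistence()
    print(f"bottom fifth again next year: {stay:.0%}")
    print(f"bottom fifth to top fifth:    {flip:.0%}")
```

Lowering `reliability` pushes the first figure toward the 20% expected by chance alone, and raising it pushes the figure toward 100%. Real VAM estimates also reflect true changes in teaching and non-normal errors, so this is only a rough guide to how much reshuffling noise alone can produce.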

Dr. Haertel also discusses the correlations between VAM estimates and teacher observational scores, between VAM estimates and student evaluation scores, and between VAM estimates taken from the same teachers at the same time but using different tests, all of which are abysmally (and unfortunately) low, similar to those mentioned above.

His bottom line? “VAMs are complicated, but not nearly so complicated as the reality they are intended to represent” (Haertel, 2013, p. 12). They just do not measure well what so many believe they measure so very well.

Again, to find out more reasons and more in-depth explanations as to why, click here for the full speech and subsequent paper.

Random Assignment and Bias in VAM Estimates – Article Published in AERJ

“Nonsensical,” “impractical,” “unprofessional,” “unethical,” and even “detrimental” – these are just a few of the adjectives used by elementary school principals in Arizona to describe the use of randomized practices to assign students to teachers and classrooms. When asked whether principals might consider random assignment practices, one principal noted, “I prefer careful, thoughtful, and intentional placement [of students] to random. I’ve never considered using random placement. These are children, human beings.” Yet the value-added models (VAMs) being used in many states to measure the “value added” by individual teachers to their students’ learning assume that any school is as likely as any other school, and any teacher is as likely as any other teacher, to be assigned any student who is as likely as any other student to have similar backgrounds, abilities, aptitudes, dispositions, motivations, and the like.

One of my doctoral students – Noelle Paufler – and I recently reported in the highly esteemed American Educational Research Journal the results of a survey administered to all public and charter elementary principals in Arizona (see the online publication of “The Random Assignment of Students into Elementary Classrooms: Implications for Value-Added Analyses and Interpretations”). We examined the various methods used to assign students to classrooms in their schools, the student background characteristics considered in nonrandom placements, and the roles teachers and parents play in the placement process. In terms of bias, the fundamental question here was whether the use of nonrandom student assignment practices might lead to biased VAM estimates, if the nonrandom student sorting practices went beyond that which is typically controlled for in most VAMs (e.g., academic achievement and prior demonstrated abilities, special education status, ELL status, gender, giftedness, etc.).

We found that, overwhelmingly, principals use various placement procedures through which administrators and teachers consider a variety of student background characteristics and student interactions to make placement decisions. In other words, student placements are overwhelmingly nonrandom (contrary to the methodological assumptions to which VAM consumers often agree).

Principals frequently cited interactions between students, students’ peers, and previous teachers as justification for future placements. Principals stated that students were often matched with teachers based on the students’ individual learning styles and the teachers’ respective teaching strengths. Parents also wielded considerable control over the placement process, with a majority of principals stating that parents made placement requests, the majority of which are honored.

In addition, in general, principal respondents were greatly opposed to using random student assignment methods in lieu of placement practices based on human judgment—practices they collectively agreed were in the best interest of students. Random assignment, even if necessary to produce unbiased VAM-based estimates, was deemed highly “nonsensical,” “impractical,” “unprofessional,” “unethical,” and even “detrimental” to student learning and teacher success.

The nonrandom assignment of students to classrooms has significant implications for the use of value-added models to estimate teacher effects on student learning using large-scale standardized test scores. Given the widespread use of nonrandom methods as indicated in this study, value-added researchers, policymakers, and educators should carefully consider the implications of their placement decisions as well as the validity of the inferences made using value-added estimates of teacher effectiveness.
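For readers who like to see the mechanics, the bias concern can be sketched with a toy simulation. This is not the analysis from our article, and it is far simpler than any real VAM. It assumes two teachers with identical true effects (zero), students who differ on a single trait the model does not observe (call it motivation), and a naive "value-added" estimate equal to the class's mean gain. The fully sorted placement is a deliberately extreme case, and the function name and sample size are hypothetical.

```python
import random


def estimated_gap(sorted_placement, n_students=20000, seed=0):
    """Estimated effect of teacher A minus teacher B.

    Both teachers have the SAME true effect (0.0), so any gap in the
    estimates comes from who was placed in which classroom, plus noise.
    """
    rng = random.Random(seed)
    # Unobserved student trait (an assumption): not controlled for by the model.
    trait = [rng.gauss(0, 1) for _ in range(n_students)]
    if sorted_placement:
        # Extreme nonrandom placement: the highest-trait half goes to teacher A.
        trait.sort(reverse=True)
    else:
        # Random placement: shuffle before splitting into two classrooms.
        rng.shuffle(trait)
    half = n_students // 2

    def mean_gain(students):
        # Gain = teacher effect (0.0) + unobserved trait + test noise.
        gains = [0.0 + u + rng.gauss(0, 1) for u in students]
        return sum(gains) / len(gains)

    return mean_gain(trait[:half]) - mean_gain(trait[half:])


if __name__ == "__main__":
    print("gap under random placement:", round(estimated_gap(False), 2))
    print("gap under sorted placement:", round(estimated_gap(True), 2))
```

Under random placement the two equally effective teachers look about the same. Under sorted placement one looks far better than the other, even though nothing about their teaching differs. Real placements are nowhere near this extreme, and real models control for more, so the sketch only shows the direction of the problem, not its size.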

Why VAMs & Merit Pay Aren’t Fair

An “oldie” (i.e., published about one year ago), but a goodie! This one is already posted in the video gallery of this site, but it recently came up again as a good, short (three-minute) video that captures some of the main issues.
Check it out and share as (so) needed!

Six Reasons Why VAMs and Merit Pay Aren’t Fair