VAMs at the Value-Added Research Center (VARC)

Following up on our last post, which featured Professor Haertel's analysis of the "Oak Tree" video produced and disseminated by the Value-Added Research Center (VARC), affiliated with the Wisconsin Center for Education Research at the University of Wisconsin-Madison, I thought I would share, as requested by the same VAMboozled! reader, a bit more about VARC and what I know about this organization and its VAM.

Dr. Robert H. Meyer founded VARC in 2004 and currently serves as VARC’s Research Director. Accordingly, VARC’s value-added model is also known as Meyer’s model, just as the EVAAS® is also known as Sanders’s model.

Like the EVAAS®, VARC has a mission to perform ground-breaking work on value-added systems, as well as to conduct value-added research to evaluate the effectiveness of teachers (and schools/districts) and educational programs and policies. Unlike the EVAAS®, however, VARC describes its methods as transparent. That said, there is actually more information about the inner workings of the EVAAS® model on the SAS website and via other publications than there is about the VARC model and its methods. This is likely due to the relative youth of the VARC model, as VARC is currently at year three in terms of model development and implementation (VARC, 2012c).

Nonetheless, VARC has a "research-based philosophy," and VARC officials have stated that one of their missions is to publish VARC work in peer-reviewed, academic journals (Meyer, 2012). VARC has ostensibly made publishing in externally reviewed journals a priority, possibly because of the academics within VARC, as well as its affiliation with the University of Wisconsin-Madison. However, very few studies have been published to date about the model and its effectiveness, again likely given its infancy. Instead (like with the EVAAS®), the Center has disproportionately produced and disseminated technical reports, white papers, and presentations, all of which (like with the EVAAS®) seem to also be disseminated for marketing and other informational purposes, including the securing of additional contracts. Unfortunately, a commonality across the two models is that they both seem bent on implementation before validation.

Regardless, VARC defines its methods as "collaborative," given that VARC researchers have worked with school districts, mainly in Milwaukee and Madison, to help them better build and better situate their value-added model within the realities of districts and schools (VARC, 2012c). As well, VARC defines its value-added model as "fair," though what this means remains unclear. Otherwise, and again, little is yet known about the VARC model itself, including its strengths and weaknesses.

But I would bet some serious cash that this model has the same or similar issues as all other VAMs. To review these issues, please click here to (re)read the very first post on VAMboozled! (October 30, 2013) about these general but major issues.

Otherwise, here are some additional specifics:

  • The VARC model uses generally accepted research methods (e.g., hierarchical linear modeling) to purportedly measure and evaluate the contributions that teachers (and schools/districts) make to student learning and achievement over time (a generic sketch of this family of models appears just after this list).
  • VARC compares individual students to students who are like them by adjusting the statistical models using the student background factors listed next. Unlike the EVAAS®, however, VARC does make modifications for student background variables that are outside of a teacher's (or school's/district's) direct control.
  • VARC's controls include up to approximately 30 variables, including the standard ones: race, gender, ethnicity, levels of poverty, students' levels of English language proficiency, and special education statuses. VARC also uses other variables when available, including, for example, student attendance, suspension, and retention records and the like. For this and other reasons, and according to Meyer, this helps to make the VARC model "arguably one of the best in the country in terms of attention to detail."
  • Then (as with the EVAAS®), whether students' growth scores, aggregated at the teacher (or school/district) level, statistically exceed, meet, or fall below their growth projections (i.e., fall more than one standard deviation above the mean, within one standard deviation of it, or more than one standard deviation below it) helps to determine teachers' (or schools'/districts') value-added scores and subsequent rankings and categorizations. Again, these are relative, determined by where other teachers (or schools/districts) ultimately land, and they are based on the same assumption that "effectiveness" is defined by the average of the teacher (or school/district) population.
  • Like with the EVAAS®, VARC also does this work with publicly subsidized monies, although, in contrast to SAS®, VARC is a non-profit organization.
  • Given my best estimates, VARC is currently operating 25 projects exceeding a combined $28 million (i.e., $28,607,000), funded by federal (e.g., the U.S. Department of Education, the Institute of Education Sciences, the National Science Foundation), private (e.g., Battelle for Kids, The Joyce Foundation, The Walton Foundation), and state and district sources.
  • VARC is currently contracting with the state departments of education in Minnesota, New York, North Dakota, South Dakota, and Wisconsin. VARC is also contracting with large school districts in Atlanta, Chicago, Dallas, Fort Lauderdale, Los Angeles, Madison, Milwaukee, Minneapolis, New York City, Tampa/St. Petersburg, and Tulsa.
  • Funding for the 25 projects currently in operation ranges from a small, short-term $30,000 project to a large, longer-term $4.2 million project.
  • Across the grants that have been funded, regardless of type, the VARC projects currently in operation are funded at an average of $335,000 per year, with an average total of roughly $1.1 million per grant (i.e., $28,607,000 across 25 projects).
  • It is also evident that VARC is expanding its business rapidly across the nation. In 2004, when the center was first established, VARC was working with fewer than 100,000 students across the country. By 2010, this number had increased 16-fold; VARC was by then working with data from approximately 1.6 million students in total.
  • VARC delivers sales pitches in ways similar to SAS®, although those affiliated with VARC do not seem to overstate their advertising claims quite like those affiliated with the EVAAS® do.
  • Additionally, VARC officials are greatly focused on the use of value-added estimates for data-informed decision-making: "All teachers should [emphasis added] be able to deeply understand and discuss the impact of changes in practice and curriculum for themselves and their students."
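As referenced in the first bullet above, here is a minimal sketch of the general family of covariate-adjustment value-added models to which the VARC model appears to belong. VARC has not published its exact specification, so the equation below is a generic illustration of the approach, not VARC's actual model:

```latex
% A generic covariate-adjustment value-added model (illustrative only).
% This is NOT VARC's published specification; it sketches the model family.
y_{ijt} = \beta_0 + \beta_1\, y_{ij,t-1} + \boldsymbol{\gamma}' \mathbf{X}_{ij} + \theta_j + \varepsilon_{ijt}
```

Here, y_{ijt} is student i's test score in teacher j's classroom in year t; y_{ij,t-1} is the student's prior-year score; X_{ij} is the vector of up-to-30 student background controls described above; θ_j is the estimated teacher "effect" (the value-added score); and ε_{ijt} is residual error. Teachers are then ranked by where their estimated θ_j lands relative to the mean (e.g., more than one standard deviation above or below it), which is why such rankings are inherently relative: they depend on where the rest of the teacher population lands, not on any absolute standard of effectiveness.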

Comparing Oak Trees’ “Apples to Apples,” by Stanford’s Edward Haertel

A VAMboozled! follower posted this comment via Facebook the other day: "I was wondering if you had seen this video by The Value-Added Research Center [VARC], called the 'Oak Tree Analogy' [it is the second video down]? My children's school district has it on its website. What are your thoughts about VARC, and the video?"

I have my own thoughts about VARC (shared in the post above), but better than that, I have somebody else's much wiser thoughts about this video, which has in many ways gone "viral."

Professor Edward Haertel, of the School of Education at Stanford University, wrote Linda Darling-Hammond (Stanford), Jesse Rothstein (Berkeley), and me an email a few years ago about just this video. While I could not find the eloquent email he drafted at the time, I persuaded (read: begged) him to recreate here what he wrote then, for all of you.

You might want to watch the video first, to follow along, or at least to more critically view its contents. You decide, but Professor Haertel writes:

The Value-Added Research Center’s ‘Oak Tree’ analogy is helpful in conveying the theory [emphasis added] behind value-added models. To compare the two gardeners, we adjust away various influences that are out of the gardeners’ control, and then, as with value added, we just assume that whatever is left over must have been due to the gardener.  But, we can draw some important lessons from this analogy in addition to those highlighted in the presentation.

In the illustration, the overall effect of rainfall was an 8-inch difference in annual growth (+3 inches for one gardener’s location; -5 for the other). Effects of soil and temperature, in one direction or the other, were 5 inches and 13 inches. But the estimated effect of the gardeners themselves was only a 4-inch difference. 

As with teaching, the value-added model must sort out a small “signal” from a much larger amount of “noise” in estimating the effects of interest. It follows that the answer obtained may depend critically on just what influences are adjusted for. Why adjust for soil condition? Couldn’t a skillful gardener aerate the soil or amend it with fertilizer? If we adjust only for rainfall and temperature then Gardener B wins. If we add in the soil adjustment, then Gardener A wins. Teasing apart precisely those factors for which teachers justifiably should be held accountable versus those beyond their control may be well-nigh impossible, and if some adjustments are left out, the results will change. 

Another message comes from the focus on oak tree height as the outcome variable.  The savvy gardener might improve the height measure by removing lower limbs to force growth in just one direction, just as the savvy teacher might improve standardized test scores by focusing instruction narrowly on tested content. If there are stakes attached to these gardener comparisons, the oak trees may suffer.

The oak tree height analogy also highlights another point. Think about the problem of measuring the exact height of a tree—not a little sketch on a PowerPoint slide, but a real tree. How confidently could you say how tall it was to the nearest inch?  Where, exactly, would you put your tape measure? Would you measure to the topmost branch, the topmost twig, or the topmost leaf? On a sunny day, or at a time when the leaves and branches were heavy with rain?

The oak tree analogy does not discuss measurement error. But one of the most profound limitations of value-added models, when used for individual decision making, is their degree of error, referred to technically as low reliability. Simply put, if we compare the same two gardeners again next year, it's anyone's guess which of the two will come out ahead.
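To make Haertel's point about adjustment sensitivity concrete, here is a minimal sketch in Python. The numbers are hypothetical, chosen only to be consistent with the effect sizes quoted above (an 8-inch rainfall difference, a 13-inch temperature difference, a 5-inch soil difference, and a 4-inch fully adjusted "gardener" difference); they are not taken from the video itself:

```python
# Hypothetical numbers, consistent with the effect sizes quoted above,
# but NOT taken from the VARC video itself.
raw_growth = {"A": 19, "B": 25}  # observed annual growth, in inches

# Signed environmental effects credited to each gardener's location.
effects = {
    "rain": {"A": +3, "B": -5},  # 8-inch difference between locations
    "temp": {"A": -6, "B": +7},  # 13-inch difference
    "soil": {"A": -2, "B": +3},  # 5-inch difference
}

def adjusted(gardener: str, adjust_for: list[str]) -> int:
    """Raw growth minus whatever environmental 'help' we choose to subtract."""
    return raw_growth[gardener] - sum(effects[f][gardener] for f in adjust_for)

for covariates in (["rain", "temp"], ["rain", "temp", "soil"]):
    a, b = adjusted("A", covariates), adjusted("B", covariates)
    winner = "A" if a > b else "B"
    print(f"adjusting for {covariates}: A={a}, B={b} -> Gardener {winner} wins")

# adjusting for ['rain', 'temp']: A=22, B=23 -> Gardener B wins
# adjusting for ['rain', 'temp', 'soil']: A=24, B=20 -> Gardener A wins
```

The winner flips solely because of which environmental factors we decide the gardener should not be held accountable for, which is precisely Haertel's point about teachers and the covariates a VAM does (or does not) adjust for.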

Thanks are very much in order, Professor Haertel, for having "added value" to the conversations surrounding these issues, and for helping us collectively understand the not-so-simple theory advanced via this video.

Same Teachers, Similar Students, Similar Tests, Different Answers

One of my favorite studies to date about VAMs was conducted by John Papay, an economist once at Harvard and now at Brown University. In the study, titled "Different Tests, Different Answers: The Stability of Teacher Value-Added Estimates Across Outcome Measures" and published in 2009 in the American Educational Research Journal (arguably the field's third-best and most reputable peer-reviewed journal), Papay presents evidence that different yet similar tests (i.e., similar in content and administered to similar sets of students at similar times) do not provide similar answers about teachers' value-added performance. This is an issue of validity in that, if two tests measure the same things for the same people at the same times, similar-to-the-same results should be realized. But they are not. Papay, rather, found weak to moderate rank correlations, ranging from r=0.15 to r=0.58, among the value-added estimates derived from the different tests.

Recently released, yet another study (albeit not yet peer-reviewed) has found similar results, potentially solidifying this finding further in our understandings about VAMs and their issues, particularly in terms of validity (or truth in VAM-based results). This study, "Comparing Estimates of Teacher Value-Added Based on Criterion- and Norm-Referenced Tests," released by the U.S. Department of Education and conducted by four researchers representing the University of Notre Dame, Basis Policy Research, and the American Institutes for Research, provides evidence, again, that estimates of teacher value-added based on different yet similar tests (in this case, a criterion-referenced state assessment and a widely used norm-referenced test given in the same subject at around the same time) are only moderately correlated.

The researchers found correlations in the range of r=0.44 to r=0.65, similar to what Papay found. If we had confidence in the validity of the inferences based on value-added measures, these correlations (or, more simply put, "relationships") should be much higher. While the ideal correlation coefficient, in this case r=+1.0, is very rarely achieved, for the purposes for which teacher-level value-added is currently being used, correlations above r=+0.70 or r=+0.80 would (and should) be desired, and possibly required, before high-stakes decisions about teachers are made based on these data.

In addition, the researchers found that, on average, only 33.3% of teachers' value-added estimates positioned them in the same range of scores (using quintiles, i.e., five bands each 20% wide) on both tests in the same school year. This too has implications for validity in that, again, teachers' value-added estimates should fall in the same ranges when similar tests are used, if any valid inferences are to be made from them; the small simulation below shows that this level of agreement is, in fact, just what such moderate correlations predict.
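Here is that minimal simulation sketch. It assumes, purely for illustration, that the two sets of estimates behave like bivariate normal scores correlated at 0.5, roughly the middle of the reported 0.44-0.65 range; none of the numbers come from the study itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 100_000, 0.5  # assumed correlation, mid-range of the reported 0.44-0.65

# Simulated value-added estimates for the same "teachers" on two similar tests.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Assign quintiles (five 20%-wide bands) on each test.
cuts = [0.2, 0.4, 0.6, 0.8]
qx = np.searchsorted(np.quantile(x, cuts), x)
qy = np.searchsorted(np.quantile(y, cuts), y)

print(f"same-quintile agreement: {(qx == qy).mean():.1%}")  # roughly 35%
```

Under these assumptions, roughly a third of the simulated teachers land in the same quintile on both tests, which is about what the study reports and not dramatically better than the 20% agreement expected by chance alone.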

Again, please note that this study has not yet been peer-reviewed. While some results naturally seem to make more sense than others, like the one reviewed above, peer review matters equally for results with which we might tend to agree and for results we might tend to reject. Take anything that is not peer-reviewed as just that: a study whose methods and findings have not been critically vetted by the research community.

The Trojan Horse in South Carolina

Following one of our most recent posts re: Rhee in South Carolina, a reader (Ian Kay) wrote in the following comment, which I thought important to share with the rest of you.

Ian writes:

[Rhee] will succeed here in South Carolina because she has the backing of the monied interests, including Gates, Jeff Bezos, and others. The newspapers will not print any criticism of this bunch, because they are in collusion with the goals of this movement to denigrate the teaching profession and reap the money which is always there, in good times and in bad. I tried many times to warn the community here in Charleston about this fraud, to no avail. Here is a sample of the kind of letter to the editor which they will not print. The truth from a veteran teacher will not see the light of day [until now – thanks for writing!].

The Trojan Horse is Here.

As we veteran teachers predicted, the death knell for public education is spreading slowly and inexorably throughout the country, and as the Post and Courier reported last week, Michelle Rhee and her for-profit destroyers have claimed two more districts with their fraudulent promise to improve the educational system. The State Board of Education, obviously with the blessings of the Superintendent, has approved the extension of the "non-profit" Teach For America program by placing two more districts here in the low country within their grasp. If the public believes that this is a non-profit agenda, I still have that Brooklyn Bridge for sale at a reasonable price. I wonder how much they promised the board and the Superintendent, and how many palms they had to grease, to get this program on board? I am a 28-year veteran teacher who stayed the course and is no longer intimidated by administrators and fraudulent educational leaders. What can they do to me, stop my Social Security check? It's time that someone tells the public what is really going on in the school systems, and in these contrived fiefdoms known as districts, which not only abuse the teaching staff but waste taxpayer money to fix what they caused in the first place. Where is the outrage on the part of teachers over the article detailing the outsourcing of substitute hiring to Kelly Educational Staffing? The school district says that it wants to expand the concept to the entire district this fall. Human Resources director Julie Rogers is quoted as saying that "They do a really good job of training their substitutes. It's not just a warm body, they get in there and teach." This is proof positive that the present educational leadership is not capable of operating a school system. Where is the indignation from the educational departments of our local colleges that train teachers? I could rail for a long time, but you get the picture, I'm sure. It's sickening, and it's time to return the system to some normalcy and discard these frauds posing as educational leaders before the entire system implodes.

Ian Kay
West Ashley

On a More Positive Note…

Contrary to what is going on in South Carolina with its House Bill 4419 (thanks in large part to Rhee and her "supportive" efforts, as also posted today here), the Washington State Senate just voted "no" on evaluating teachers in the state using student test scores.

It is interesting that this happened in Bill Gates's home state, given his current and widespread educational "reform" initiatives in support of the opposite, but I digress.

As per at least one of the articles capturing this story, Washington State Senate Bill 5246 failed by a 28-19 vote, even though Washington (like many other states) has a federal No Child Left Behind (NCLB) waiver requiring the state to mandate the use of such tests in such evaluation systems. The state could now lose an undisclosed (and likely unknown) amount of federal funding by not complying with this condition of the NCLB waiver.

One of the State Senators said that she voted the bill down because “using state tests to measure student growth has not been proven to be an effective way to judge teachers.”

It’s about time!!! Silver linings, perhaps, but something certainly to celebrate!

But not for another State Senator, who voted for the bill, noting in particular that "Losing the waiver would mean nearly every school in the state would have to send a letter home to parents saying they are failing to meet the requirements of the federal education law."

Is this not something else to celebrate? Depends on your stance as a leader, I guess.

Rhee Coming to AZ?!?

Speaking of Rhee (mentioned in a prior post today), it seems that she is to be in my state of Arizona as part of her most recent "tour."

Arizona, continuously basking in the glow of negative attention coming largely from the uber-conservative policies supported by our current Republican Governor, Jan Brewer, may soon have another such gubernatorial leader: Scott Smith.

I was sent, in secret, an invitation to a “Reception in Honor” of gubernatorial candidate Smith…and his special guest of the evening…[insert drum roll]…Michelle Rhee.

The party is to celebrate their collective stance towards “Efficient, Effective & Accountable Leadership for Arizona.” Sound familiar?

Anyhow, a ticket to this event comes at the cost of a campaign contribution, set at a minimum of $250 per person and a maximum of $4,000 per person. The event is to be held tonight, Friday, February 21, from 5:00-6:00 (yes, for one hour) at The Montelucia Resort in Paradise Valley, a wealthy suburb within the metropolitan Phoenix area.

It seems wealth is a politician’s common denominator of power, now doesn’t it?

On that note, I heard, and then verified recently, that current AZ Governor Brewer's highest educational credential is a radiological technologist (x-ray technician) certificate. That too says a lot about the power of money and what money can buy in the way of (too often, in the case of Arizona) some very, let's say, misguided policies. Now, thanks to Rhee, my state can get even more help with its misguidedness.

As related to Arizona's teacher accountability system in particular, the state department of education has, so far, at least tried to maintain some sanity in terms of its teacher accountability and related VAM-based policies, leaving much of this to be determined by and in the hands of districts and schools, which still very much honor and appreciate their local control. I like to think that I have had at least a little to do with this current stance, which, while not ideal, is reasonable given the current sociopolitical circumstances of my state.

But it looks like, if this candidate wins, we might be in way worse shape than we are now, despite our best intentions…and, again, with special thanks to policy clots like Rhee.

Michelle Rhee Rhumbuggery

Michelle Rhee, as many of you know, is the founder and current CEO of StudentsFirst, as well as the former Chancellor of Washington D.C.'s public schools. During her tenure there, she enacted a strict, controversial teacher evaluation system (i.e., IMPACT) that has been the source of different posts here and here, most recently following the "gross" errors in 44 D.C. public school teachers' evaluation scores.

Well, it now seems that she is on tour. She recently testified before South Carolina's K-12 Subcommittee of the House Education and Public Works Committee, specifically "about legislation regarding improving teacher evaluations and rewarding effective teachers in the public school system in S.C." It worked so well in D.C., right?!?

This "rock star," yet again, brought with her her one-string guitar, speaking in favor of South Carolina's House Bill 4419, which "addresses the way teachers are evaluated [i.e., using VAMs], rewards effective teachers with recognition and the opportunity to earn higher salaries [i.e., merit pay] and gives school leaders the opportunity and the tools to build and maintain a quality team of teachers [i.e., humbuggery, for lack of a more colorful term]."

She also brought with her her typical and emotive chorus lines:

  • “[M]ore and more state legislators are beginning to understand the crisis that we are in in America as it pertains to our public education system.”
  • “[A]s a nation, we are not doing everything that we should be doing to ensure that we are providing all children with the excellent education that they deserve.”
  • “[C]hildren who are in school today will be the first generation of Americans to be less well educated than their parents were for the first time in the history of our country.”
  • The U.S. has “recently dropped to becoming about 15th, 17th and 26th in reading, science and math (respectively).”
  • “Instead of competing against students across the country for jobs, students in U.S. public schools will be competing for jobs against children in Singapore, China and Korea.”

And her best one-liner yet:

  • “In so many school districts across the country, you’ll hear stories about how people will come in, you know, as long as you pass the criminal check, and you’ve got a pulse, you can get a job in the classroom, and then once you have that job, you know, you have that job forever.”

Rhetoric or reality? I would never intend to insult the intelligence of the readers of this blog, so I will just leave it at that.

But it seems that at least one educator presented a counterpoint to Rhee's testimony, although he unfortunately applauded some of Rhee's efforts and used gentler tactics to argue against the bill's passage.

A local professor, however, noted that state legislators should really “read the research [emphasis added] and bring experts and educators in on the conversation before a decision is made [emphasis added].”

I sure do hope legislators at least begin to heed this call, and begin to deconstruct and better understand her humbuggery, or more specifically, the fraudulent soundbites Rhee continues to use to advance her highly misguided educational policies and vested interests.

“Significant Flaws” with New York’s Teacher Evaluation Data

"State officials admit problems with new teacher ratings" is how the headline reads in New York. More specifically, New York State Education Department officials are acknowledging "significant flaws" in their new system of rating more than 126,000 teachers statewide by their effectiveness, and these errors are causing the state to push back the public release of the data.

While the state did not disclose any details defining the types of "significant flaws" with which it was dealing, anecdotal evidence suggests there may be rating errors across similar tests, as well as scoring errors whereby teachers who score consistently across subject areas nonetheless score below par in their overall categorization/ranking.

In addition, this release is only to be pushed back until around March, even though the data to be released are about teacher performance from the 2012-13 school year. Yes, this March, teachers will be getting their data from last year.

Here's the deal on this one. One of the biggest drawbacks of such teacher evaluation systems is that they have literally no instrumental value; that is, no states across the country have yet figured out how to use these data for instrumental or change-based purposes, to inform the betterment of schools, teacher quality, and, most importantly, students' learning and achievement, and no states yet have plans to make these data useful. These systems are 100% about accountability, and more accurately about a symbolic accountability that, again, has little to no instrumental value. No peer-reviewed studies, for example, have demonstrated that having these data actually improves, or does much of anything for, schools since such data systems have been implemented. This is largely due to a lack of transparency in these systems, to the high levels of confusion practitioners face when they try to consume and use these data (many times because the data reported are far removed from the realities and content particulars they teach), and to issues like this one in New York. Oftentimes, by the time teachers get their evaluation reports, their students are well into subsequent grades, in this case almost two grade levels later.

Follow-Up On Kane’s Testimony

As a follow-up to our most recent post, for those of you interested in reading more about Kane's testimony in the Vergara v. California case, please see the following post written by John Thompson in another blog, titled "Does the Gates Foundation's Evidence Argue For or Against Vergara?"

As the title implies, Thompson connects the dots between the Gates Foundation and the others who are also part of the “Billionaires’ Boys Club” who are supporting (although not directly financing) this case. Thompson also provides a much more thorough and detailed analysis of Kane’s testimony. Do give it a read.

As an aside, I also just happened across a press release announcing that one of the case's current star witnesses, Los Angeles Superintendent John Deasy (actually the first witness called to the stand by the plaintiffs), served, starting in 2009, as the Gates Foundation's Deputy Director of Education. While it does not seem that Deasy is directly affiliated with the Gates Foundation anymore, perhaps this connection had something to do with the choice of Los Angeles Unified as the location for this lawsuit. In all fairness, however, Deasy is "providing fodder" for both sides, including testimony for the defense about the district's current labor laws and how firing "ineffective" teachers really "comes down to the choices and competence of management, not the constitutionality of current regulations."

Chicken or the Egg?

"Which came first, the chicken or the egg?" is an age-old question, but it is, more importantly, a dilemma about identifying what is truly cause and what is truly consequence.

Recall from a recent post that nine public school students are currently challenging California's teacher tenure system, arguing that their right to a good education is being violated by job protections that shield ineffective teachers but do not shield students from being instructed by said teachers. Recall, as well, that a wealthy technology magnate [David Welch] is financing the whole case, which is also affiliated with and backed by Students Matter. The ongoing suit is called "Vergara v. California."

Welch and Students Matter have thus far brought to testify an all-star cast, most recently including Thomas Kane, an economics professor from Harvard University. Kane, not surprisingly a VAM advocate, also directed the $45 million Measures of Effective Teaching (MET) studies for the Bill & Melinda Gates Foundation, advancing a series of highly false claims about the wonderful potentials of VAMs, potentials that, once again, did not pass any sort of peer review but still made it to the U.S. Congress. To read about the many methodological and other problems with the MET studies, click here.

If I were to make a list of VAMboozlers, Kane would be near the top, especially as he is increasingly using his Harvard affiliation to advance his own (profitable) credibility in this area. To read an insightful post about just this, read VAMboozled! reader Laura Chapman's comment at the bottom of a recent post here, in which she wrote, "Harvard is only one of a dozen high profile institutions that has become the source of propaganda about K-12 education and teacher performance as measured by scores on standardized tests."

Anyhow, and as per a recent article in the Los Angeles Times, Kane testified that “Black and Latino students are more likely to get ineffective teachers in Los Angeles schools than white and Asian students,” and that “the worst teachers–in the bottom 5%–taught 3.2% of white students and 5.4% of Latino students. If ineffective teachers were evenly distributed, you’d expect that 5% of each group of students would have these low-rated instructors.” He concluded that “The teaching-quality imbalance especially hurts the neediest students because ‘rather than assign them more effective teachers to help close the gap with white students they’re assigned less effective teachers, which results in the gap being slightly wider in the following year.”

Kane's research was, of course, used to support the claim that bad teachers are causing the disparities he cited, regardless of the fact that the inverse could be equally or even more true: that the value-added measures used to gauge teacher effectiveness in these schools are biased by the very nature of the students who contribute their low test scores to such estimates. As is increasingly being demonstrated in the literature, these models are biased by the types of students in the classrooms and schools that contribute to the measures themselves.

So which one came first, the chicken or the egg? The real question here, and one I wish the defendants had posed, is whether the students in these schools caused such teachers to appear less effective when, in fact, they might have been just as effective as "similar" teachers teaching more advantaged kids across town. What we do know from the research literature is that, indeed, there are higher turnover rates in such schools, and oftentimes such schools become "dumping grounds" for teachers who cannot be terminated due to such tenure laws; this is certainly a problem. But to claim that teachers in such schools are causing poor achievement is certainly cause for concern, not to mention a professional and research-based ethics concern as well.

Kane's "other notable finding was that the worst teachers in Los Angeles are doing more harm to students than the worst ones in other school systems that he compared. The other districts were New York City, Charlotte-Mecklenburg, Dallas, Denver, Memphis and Hillsborough County in Florida." Don't ask me how he figured that one out across states that use different tests and different systems and that have in their schools entirely different and unique populations. Amazing what some economists can accomplish with advanced mathematical models…and just a few (heroic) assumptions.