EVAAS’s SAS Inc.: “The Frackers of the Educational World”

David Patten, a former history teacher, college instructor, and author, recently wrote an excellent article for the History News Network about VAMs in his state of Ohio (another state that uses the Education Value-Added Assessment System [EVAAS] statewide). He writes about what the state of Ohio is getting for its money, at a rate of $2.3 million per year. Just to be clear, this covers only the cost of calculating the state’s value-added estimates, based on the state’s standardized tests; it does not include what the state also pays yearly for the standardized tests themselves.

You can read the full article here, but here are some of the key highlights as they directly pertain to VAMs in Ohio.

Patten explains that Ohio uses a five-level model that combines teachers’ EVAAS scores with scores derived from their administrators’ observations into a 50/50 teacher evaluation model, one that ultimately sorts teachers into four teacher quality categories: 1. Ineffective, 2. Developing, 3. Skilled, and 4. Accomplished. While Ohio currently uses its own state tests, it will soon integrate, and likely replace, these with the Common Core tests and/or tests purchased from “approved vendors.”

As for the specifics of the model, however, he writes that the EVAAS system (as others have written extensively) is pretty “mysterious” beyond the more or less obvious.

What exactly is the mathematical formula that will determine the fate of our teachers and our educational systems? Strangely enough, only the creators know the mysterious mix; and they refuse to reveal it.

The dominant corporation in the field of value added is SAS, a North Carolina company. Their Value Added Assessment and Research Manager is Dr. William Sanders [the topic of a previous post here] who is also the primary designer of their model. While working at the University of Tennessee, his remarkable research into agricultural genetics and animal breeding inspired the very model now in use for teacher evaluation. The resultant SAS [EVAAS] formula boasts a proprietary blend of numbers and probabilities. Since it is a closely guarded intellectual property, it becomes the classic enigma wrapped up in taxpayer dollars. As a result, we are urged to take its validity and usefulness as an article of faith. SAS and their ilk have, in fact, become the frackers of the educational world [emphasis added.] They propose to drill into our educational foundations, inject their algorithmic chemicals into our students and instructors, and just like the frackers of the oil and gas world, demand that we trust them to magically get it right.

Strangely enough, Ohio is not worried about this situation. Indeed, no one at the Ohio Department of Education has embraced even the pretense of understanding the value added model it adopted. Quite to the contrary, they admitted to never having seen the complete model, let alone analyzing it. They have told us that it does not matter, for they do not need to understand it. In their own words, they have chosen to “rely upon the expertise of people who have been involved in the field.” Those are remarkable words and admissions and they are completely consistent with an educational bureaucracy sporting the backbone of an éclair.

In terms of dollars and cents, trust comes at a very high price. Ohio will pay SAS, Inc. an annual fee of 2.3 million dollars to calculate value added scores. I found very similar fees in the other states making use of their proprietary expertise.

Should we be afraid of this mystical undertaking? Of course not, instead, we should be terrified. Not only are we stumbling into the dark, unseen room and facing all the horror that implies, but the research into the effectiveness of the model shows it to be as educationally decrepit as the high stakes testing upon which it is based…

…Mark Twain supposedly said, “Sometimes I wonder whether the world is being run by smart people who are just putting us on or by imbeciles who really mean it.” Whether the value added advocates are smart people or imbeciles is unknown to me. What is known to me is that value added has no value. Through it and through standardized testing we have become the architects of an educational system of breathtaking mediocrity. One more thing is abundantly clear; no student and no teacher should ever accept a ride from the “Value Added Valkyries.”

To read more, including the research Patten highlights to substantiate his claims, again, click here.


This State Pays a Company $2.3 Million to Rank Teachers by an Algorithmic Formula? – See the full article at: http://www.hnn.us/article/155515


David Patten is an award-winning history teacher, college lecturer, and the author of articles – See more at: http://www.hnn.us/article/155515




The Houston Chronicle, also on the TVAAS in Tennessee

The Houston Chronicle also recently published an article on the Tennessee Value-Added Assessment System (TVAAS) and what is happening and still ongoing in Tennessee. This is certainly interesting and worth a read, particularly because the Houston Independent School District is using essentially the same value-added model (i.e., the EVAAS) critiqued in this piece about Tennessee.

From the top of the article: “When Tennessee was competing for a half-billion dollars in federal education money, teachers agreed to allow the state to ramp up its use of student test scores for evaluating educators. But since winning the $500 million Race to the Top competition in 2010, teachers say the state has gone too far in using student test scores to assess their performance. They are now calling for legislation to place a moratorium on the use of so-called TVA[A]S scores until a special committee can review them.”

Maybe what is being argued and debated in Tennessee will have some carry-over effects in Houston as well. We shall see.

What is also worth pointing out, though, is another trend. As explained by a teacher in the article: “She said she’s actually benefited from changes to the teacher evaluation system, such as more constructive feedback because of the increased number of observations.” Almost always, when a counterpoint is needed for an article such as this, a teacher says he or she sees “value” in the system; but almost every time, if not every time, it is because of the increased professional observations of teacher practice, not because of the value-added component or the value-added data derived. The “formative” or “informative” aspects of these systems have yet to be realized.

A Tennessee Teacher, On the TVAAS and Other Issues of Concern

Check out this 5-minute video to hear from a teacher in Tennessee – the state recognized for bringing value-added models and VAM-based teacher accountability to the country – as she explains how things are going in her state of Tennessee.

Diane Ravitch, in her call to all of us to share out this and other videos/stories such as these, writes that we should help this video, now with over 100,000 views, reach every parent and teacher across the country. “We can be the change,” and social media can help us counter the nonsense expressed so well herein.

VAMs at the Value-Added Research Center (VARC)

Following up on our last post, which included Professor Haertel’s analysis of the “Oak Tree” video produced and disseminated by the Value-Added Research Center (VARC), affiliated with the Wisconsin Center for Education Research at the University of Wisconsin-Madison, I thought I would follow up, as also requested by the same VAMboozled! reader, with a bit more about VARC and what I know about this organization and its VAM.

Dr. Robert H. Meyer founded VARC in 2004 and currently serves as VARC’s Research Director. Accordingly, VARC’s value-added model is also known as Meyer’s model, just as the EVAAS® is also known as Sanders’s model.

Like with the EVAAS®, VARC has a mission to perform ground-breaking work on value-added systems, as well as to conduct value-added research to evaluate the effectiveness of teachers (and schools/districts) and educational programs and policies. Unlike with the EVAAS®, however, VARC describes its methods as transparent. There is actually more information about the inner workings of the EVAAS® model on the SAS website and via other publications than there is about the VARC model and its methods, though this is likely due to the relative youth of the VARC model, as VARC is currently at year three in terms of model development and implementation (VARC, 2012c).

Nonetheless, VARC has a “research-based philosophy,” and VARC officials have stated that one of their missions is to publish VARC work in peer-reviewed, academic journals (Meyer, 2012). VARC has ostensibly made publishing in externally reviewed journals a priority, possibly because of the presence of the academics within VARC, as well as its affiliation with the University of Wisconsin-Madison. However, very few studies have been published to date about the model and its effectiveness, again likely given its infancy. Instead (like with the EVAAS®), the Center has disproportionately produced and disseminated technical reports, white papers, and presentations, all of which (like with the EVAAS®) seem to also be disseminated for marketing and other informational purposes, including the securing of additional contracts. Unfortunately, a commonality across the two models is that they both seem bent on implementation before validation.

Regardless, VARC defines its methods as “collaborative” given that VARC researchers have worked with school districts, mainly in Milwaukee and Madison, to help them better build and better situate their value-added model within the realities of districts and schools (VARC, 2012c). As well, VARC defines its value-added model as “fair.” What this means remains unclear. Otherwise, and again, little is known about the VARC model itself, including its strengths and weaknesses.

But I would bet some serious cash the model, like the others, has the same or similar issues as all other VAMs. To review these issues, please click here to (re)read the very first post on VAMboozled! (October 30, 2013), about these general but major issues.

Otherwise, here are some additional specifics:

  • The VARC model uses generally accepted research methods (e.g., hierarchical linear modeling) to purportedly measure and evaluate the contributions that teachers (and schools/districts) make to student learning and achievement over time.
  • VARC compares individual students to students who are like them by adjusting the statistical models using the aforementioned student background factors. Unlike the EVAAS®, however, VARC does make modifications for student background variables that are outside of a teacher’s (or school’s/district’s) direct control.
  • VARC controls include up to approximately 30 variables, including the standard race, gender, ethnicity, levels of poverty, students’ levels of English language proficiency, and special education statuses. VARC also uses other variables when available, including, for example, student attendance, suspension, and retention records and the like. For this and other reasons, and according to Meyer, this helps to make the VARC model “arguably one of the best in the country in terms of attention to detail.”
  • Then (like with the EVAAS®) whether students whose growth scores are aggregated at the teacher (or school/district) levels statistically exceed, meet, or fall below their growth projections (i.e., also above or below one standard deviation from the mean) helps to determine teachers’ (or schools’/districts’) value-added scores and subsequent rankings and categorizations. Again, these are relatively determined depending on where other teachers (or schools/districts) ultimately land, and they are based on the same assumption that effectiveness is the average of the teacher (or school/district) population.
  • Like with the EVAAS®, VARC also does this work with publicly subsidized monies, although, in contrast to SAS®, VARC is a non-profit organization.
  • Given my best estimates, VARC is currently operating 25 projects exceeding a combined $28 million (i.e., $28,607,000) in federal (e.g., from the U.S. Department of Education, Institute of Education Sciences, National Science Foundation), private (e.g., from Battelle for Kids, The Joyce Foundation, The Walton Foundation), and state and district funding.
  • VARC is currently contracting with the state departments of education in Minnesota, New York, North Dakota, South Dakota, and Wisconsin. VARC is also contracting with large school districts in Atlanta, Chicago, Dallas, Fort Lauderdale, Los Angeles, Madison, Milwaukee, Minneapolis, New York City, Tampa/St. Petersburg, and Tulsa.
  • Funding for the 25 projects currently in operation ranges from the lowest, short-termed, and smallest-scale $30,000 project to the highest, longer-termed, and larger-scale $4.2 million project.
  • Across the grants that have been funded, regardless of type, the VARC projects currently in operation are funded at an average of $335,000 per year with an average funding level just under $1.4 million per grant.
  • It is also evident that VARC is expanding its business rapidly across the nation. In 2004, when the center was first established, VARC was working with fewer than 100,000 students across the country. By 2010 this number had increased 16-fold; VARC was then working with data from approximately 1.6 million students in total.
  • VARC delivers sales pitches in similar ways, although those affiliated with VARC do not seem to overstate their advertising claims quite like those affiliated with EVAAS®.
  • Additionally, VARC officials are greatly focused on the use of value-added estimates for data informed decision-making. “All teachers should [emphasis added] be able to deeply understand and discuss the impact of changes in practice and curriculum for themselves and their students.”

An Assistant Principal from Tennessee on the EVAAS System

One of the best parts about this blog, in my humble opinion, is hearing from folks in the field who are living out the realities and consequences of value-added as being implemented across America’s public schools. On that note, if you are a practitioner and ever feel like writing me directly, about your experiences (good or bad) with value-added specifically, please do so.

One Assistant Principal from Tennessee recently wrote me an email, which I asked for his permission to share with all of you thereafter, about the extent to which value-added scores in his state, as derived via the Education Value-Added Assessment System (EVAAS) used throughout Tennessee (and some other states and many other districts), are biased not only by the types of students non-randomly assigned to teachers’ classrooms but also by the types of subject areas taught in certain grade levels.

I have a keen interest in the EVAAS, specifically, as this is the proprietary model I have studied and researched now for almost 10 years (see related studies in the readings section of this blog if you’re so inclined). This is also the value-added system that, let’s say, “inspired” me to devote my scholarly efforts to this topic.

He wrote:

I wanted to share some insights and observations I’ve noticed about TVAAS [the EVAAS is called the Tennessee Value-Added Assessment System in Tennessee] through the years. I’m sure your analysis and deconstruction of our state’s VAM will be much more thorough and mathematically sound than anything I can bring to light [see another two posts forthcoming including our analyses]. That being the case, when there are things that the lay person can notice, I feel them worth sharing [and they are].

Currently I am working as an Assistant Principal in Middle Tennessee. I spent nine years teaching middle school where I was evaluated based on my TVAAS scores.  I usually did quite well on my TVAAS scores but over the years noticed troubling repetitive features of the scores at my school.

What disturbs me most, and I have never seen this addressed, is how subject and grade specific high and low value added scores correlate to specific subjects and grade levels.  For example, in Tennessee 4th and 8th grade ELA [English/language arts] scores consistently have high value added marks while 6th, and 7th grade ELA do poorly. The 5th grade scores are more of a mixed bag. To illustrate this, go to the TVAAS website and look at Shelby’s [Memphis] or Davidson County’s [Nashville] or Williamson’s (Nashville’s most affluent bedroom community) TVAAS scores (Value Added Summary in District Reports).

To me this clearly illustrates that TVAAS is correlated much more to subject and grade level than teacher effect. Were these the results of a few schools you might assume it was a pedagogical issue. When you see this consistent pattern across thirty and forty schools it causes me [now as an Assistant Principal] concern to evaluate teachers with this tool.

The other feature of TVAAS that concerns me is the violent swings of high school value added math scores compared [to] the relatively subtle gains and losses of ELA. Clicking on any value added summary for any EOC [End of Course] test on the state website should bring up the EOC scores for all subjects. What I see are math scores that frequently exceed (+/-) 20. ELA TVAAS scores seldom exceed (+/-) 6.

Why would a metric designed to measure [the] teacher effect be so stable in ELA and fluctuate so much in math?

This is precisely one of the many questions that administrators, policymakers, and members of the public, whose monies are going to support this and other systems, should be asking. If the companies cannot produce research clearly evidencing that this type of bias is not occurring, they should not get the contracts, nor the substantial monies that accompany them.
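The subject-level pattern the Assistant Principal describes is easy to check for yourself against published district reports: compare the spread of value-added scores by subject. The snippet below uses hypothetical stand-in numbers (mirroring his observed +/- 20 swings in math versus +/- 6 in ELA), not actual TVAAS data.

```python
# Compare the spread (standard deviation) of value-added scores by
# subject. The numbers are hypothetical stand-ins for what one might
# pull from TVAAS district reports, not actual Tennessee data.
from statistics import stdev

scores = {
    "math": [21.3, -18.7, 15.2, -22.1, 9.8, -12.4],
    "ela":  [2.1, -3.4, 1.8, -0.9, 4.2, -2.6],
}

for subject, vals in scores.items():
    print(subject, "SD of value-added scores:", round(stdev(vals), 1))

# If the metric measured a stable "teacher effect," the spread should be
# broadly similar across subjects; an order-of-magnitude difference
# suggests the scores track subject/test artifacts instead.
ratio = stdev(scores["math"]) / stdev(scores["ela"])
print("math/ela spread ratio:", round(ratio, 1))
```

A simple descriptive check like this cannot prove bias, but it is exactly the kind of pattern the companies selling these systems should be required to explain.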

See our analyses as per this Assistant Principal’s expressed concerns forthcoming in the next two VAMboozled! posts.

The Gates Foundation and its “Strong Arm” Tactics

Following up on VAMboozled!’s most recent post, about the Bill & Melinda Gates Foundation’s $45 million worth of bogus Measures of Effective Teaching (MET) studies that were recently honored with a 2013 Bunkum (i.e., meaningless, irrelevant, junk) Award by the National Education Policy Center (NEPC), it seems that the Bill & Melinda Gates Foundation is, once again, “strong-arming states [and in this case a large city district] into adoption of policies tying teacher evaluation to measures of students’ growth.”

According to Nonprofit Quarterly, the Gates Foundation is now threatening to pull an unprecedented $40 million grant from Pittsburgh’s Public Schools “because the foundation is upset with the lack of an agreement between the school district and the teachers’ union over a core element of the grant” — the use of test scores to measure teachers’ value-added and to “reward exceptional teachers and retrain those who don’t make the grade.”

More specifically, the district and its teachers are not coming to an agreement about how they should be evaluated, rightfully because teachers understand better than most (even some VAM researchers) that these models are grossly imperfect, largely biased by the types of students non-randomly assigned to their classrooms and schools, highly unstable (i.e., grossly fluctuating from one year to the next when they should remain more or less consistent over time, if reliable), invalid (i.e., they do not have face validity in that they often contradict other valid measures of teacher effectiveness), and the like.

It seems, also, that Randi Weingarten, having recently taken a position against VAMs (as posted in VAMboozled! here and here), has also “added value,” at least in terms of the extent to which teachers in Pittsburgh are (rightfully) exercising some more authority and power over the ways in which they are to be (rightfully) evaluated. Unfortunately, however, money talks, and $40 million of it is a lot to give up for a publicly funded district like this one in Pittsburgh.

A Consumer Alert Issued by The 21st Century Principal

In an excellent post just released by The 21st Century Principal, the author writes about yet another two companies calculating value-added for school districts, again on the taxpayer’s dime. Teacher Match and Hanover Research are the companies specifically named and targeted for marketing and selling a series of highly false assumptions about teaching and teachers, highly false claims about value-added (without empirical research in support), highly false assertions about how value-added estimates can be used for better teacher evaluation/accountability, and highly false sales pitches about what they, as value-added/research “experts,” can do to help with the complex statistics needed for the above.

The main points of the article, as I see them and in order of priority, follow:

  1. School districts are purchasing these “products” based entirely on the promises and related marketing efforts of these (and other) companies. Consumer Alert! Instead of accepting these (and other) companies’ sales pitches and promises that these companies’ “products” will do what they say they will, these companies must be forced to produce independent, peer-reviewed research to prove that what they are selling is in fact real. If they can’t produce the studies, they should not earn the contracts!!
  2. Doing all of this is just another expensive drain on what are already short educational resources. One district is paying over $30,000 to Teacher Match per year for their services, as cited in this piece. Related, the Houston Independent School District is paying SAS Inc. $500,000 per year for their EVAAS-based value-added calculations. These are not trivial expenditures, especially when considering the other research-based initiatives on which these valuable resources could otherwise be spent.
  3. States (and the companies selling their value-added services) haven’t done the validation studies to prove that the value-added scores/estimates are valid. Again, almost always, the sales and marketing claims made by these companies are void of evidence supporting the claims being made.
  4. Doing all of this elevates standardized testing even higher in the decision-making and data-driven processes for schools, even though doing this is not warranted or empirically supported (as mentioned).
  5. Related, value-added calculations rely on inexpensive (aka “cheap”) large-scale tests, also of questionable validity, that still are not designed for the purposes for which they are being tasked and used (e.g., measuring growth upwards cannot be done without tests with equivalent scales, which really no tests at this point have).

The shame in all of this, besides the major issues mentioned in the five points above, is that the federal government, thanks to US Secretary of Education Arne Duncan and the Obama administration, is incentivizing these and other companies (e.g. SAS EVAAS, Mathematica) to exist, construct and sell such “products,” and then seek out and compete for these publicly funded and subsidized contracts. We, as taxpayers, are the ones consistently footing the bills.

See another recent article about the chaos a simple error in Mathematica’s code caused in Washington DC’s public schools, following another VAMboozled post about the same topic two weeks ago.


What’s Happening in Tennessee?

VAMs have been used in Tennessee for more than 20 years, coming about largely “by accident.” In the late 1980s, an agricultural statistician/adjunct professor at the University of Tennessee, Knoxville – William Sanders – thought that educators struggling with student achievement in the state could “simply” use more advanced statistics, similar to those used when modeling genetic and reproductive trends among cattle, to measure growth, hold teachers accountable for that growth, and solve the educational measurement woes facing the state at the time.

Sanders developed the Tennessee Value-Added Assessment System (TVAAS), which is now known as the Education Value-Added Assessment System (EVAAS®) and operated by SAS® Institute Inc. Nowadays, the SAS® EVAAS® is widely considered, with over 20 years of development, the largest, one of the best, one of the most widely adopted and used, and likely the most controversial VAM in the country. It is controversial in that it is a proprietary model (i.e., it is costly and used/marketed under exclusive legal rights of the inventors/operators), and it is often akin to a “black box” model (i.e., it is protected by a good deal of secrecy and mystery, largely given its non-transparency). Nonetheless, on the SAS® EVAAS® website developers continue to make grandiose marketing claims without much caution or really any research evidence in support (e.g., using the SAS® EVAAS® will provide “a clear path to achieve the US goal to lead the world in college completion by the year 2020”). Riding on such claims, they continue to sell their model to states (e.g., Tennessee, North Carolina, Ohio, Pennsylvania) and districts (e.g., the Houston Independent School District) regardless, but at taxpayers’ costs.

In fact, thanks to similar marketing efforts and promotions, not to mention lobbying efforts coming out of Tennessee, VAMs have now been forced upon America’s public schools by the Obama administration’s Race to the Top program. Not surprisingly, Tennessee was one of the first states to receive Race to the Top funds to the tune of $502 million, to further advance the SAS® EVAAS® model, still referred to as the TVAAS, however, in the state of Tennessee.

Well, it seems things are not going so well in Tennessee. As highlighted in this recent article, it seems that boards of education across the state are increasingly opposing the TVAAS for high-stakes use (e.g., using TVAAS estimates when issuing, renewing or denying teacher licenses).

Some of the key issues? The TVAAS is too complex to understand; teachers’ scores are highly and unacceptably inconsistent from one year to the next, and, hence, invalid; teachers are being held accountable for things that are out of their control (e.g., what happens outside of school); there are major concerns about the extent to which teachers actually cause changes in TVAAS estimates over time; the state does not have any improvement plans in place for teachers with subpar TVAAS estimates; these issues are being further complicated by changing standards (e.g., the adoption of the Common Core) at the same time; and the like.

Interestingly enough, researchers have been raising precisely these same concerns since the early 1990s when the TVAAS was first implemented in Tennessee. Had policymakers listened to researchers’ research-based concerns, many a taxpayer dollar would have been saved!

New Tests Nothing but New

Across the country, states are bidding farewell to the tests they adopted and implemented as part of No Child Left Behind (NCLB) in 2002 while at the same time “welcoming” (with apprehension and fear) a series of new and improved tests meant to, once again, increase standards and measure the new (and purportedly improved) Common Core State Standards.

I say “once again” in that we have witnessed policy trends similar to these for more than 30 years now—federally-backed policies that are hyper-reliant on increasing standards and holding educators and students accountable for meeting the higher standards with new and improved tests. Yet, while these perpetual policies are forever meant to improve and reform America’s “failing” public schools, we still have the same concerns about the same “failing” public schools despite 30 years of the same/similar attempts to reform them.

The bummer that comes along with being an academic who has dedicated her scholarly life to conducting research in this area is that it can sometimes create a sense of cynicism, or realistic skepticism, that is based on nothing more than history. Studying over three decades of similar policies (i.e., educational policies based on utopian ideals that continuously promote new and improved standards along with new and improved tests to ensure that the higher standards are met) does a realistic skeptic make!

Mark my words! Here is how history is to, once again, repeat itself:

  1. States’ test scores are going to plummet across the nation when students in America’s public school first take the new Common Core tests (e.g., this has already happened in Kentucky, New York, and North Carolina, the first to administer the new tests);
  2. Many (uninformed) policymakers and members of the media are going to point their fingers at America’s public schools for not taking the past 30 years of repeated test-based reforms seriously enough, blaming the teachers, not the new tests, on why students score so low;
  3. The same (uninformed) policymakers and members of the media will say things like “finally those in America’s public schools are going to blatantly see everything that is wrong with what they are (or are not) doing and will finally start taking more seriously their charge to teach students how to achieve higher standards with these better tests in place” – see, for example, U.S. Secretary of Education Arne Duncan’s recent comments, informing U.S. citizens that the low scores should be seen as a new baseline that will incentivize, in and of itself, school reform and improvement;
  4. Test scores will then miraculously begin to rise year-after-year, albeit for only a few years, after which the same (uninformed) policymakers and members of the media will attribute the observed increases in growth to the teachers finally taking things seriously, all the while ignoring (i.e., being uninformed) that the new and improved tests will be changing, behind the scenes, at the same time (e.g., the “cut-scores” defining proficiency will change and the most difficult test items that are always removed after new tests are implemented will be removed, all of which will cause scores to “artificially” increase regardless of what students might be learning);
  5. Educators will help out with this as they too know very well (given over 30 years of experience and training) how to help “artificially inflate” their scores, thinking much of the time that test score boosting practices (e.g., teaching to the test, narrowing the curriculum to focus excessively on the tested content) are generally in students’ best interests; but then…
  6. America’s national test scores (i.e., on the National Assessment of Educational Progress [NAEP]) and America’s international test scores (i.e., on the Trends in International Mathematics and Science Study [TIMSS], Progress in International Reading Literacy Study [PIRLS], and Programme for International Student Assessment [PISA]) won’t change much…what!?!?…then;
  7. Another round of panic will set in, fingers will point at America’s public schools, yet again, and we will, yet again (though hopefully not by the grace of more visionary educational policymakers) look to even higher standards and better tests to adopt, implement, and repeat, from the beginning – see #1 above.

While I am a gambling woman who would bet my savings on the historically-rooted predictions above, should anybody want to take that bet, I would walk away from this one altogether: in the state of Arizona alone, this “initiative” is set to cost taxpayers more than $22.5 million a year, $9 million more than the state is currently paying for its NCLB testing program (i.e., the Arizona Instrument to Measure Standards [AIMS]). Now that is a losing bet, based on nothing more than a gambler’s fallacy. That this is going to work, this time, is false!