The Intersection of Standards and their Assessments: From an AZ Teacher

In January, I wrote a post titled “An AZ Teacher’s Perspective on Her ‘Value-Added.’” Valerie Strauss covered the same story in her blog for The Washington Post, The Answer Sheet, validating for me that readers appreciate stories from the field that explain, in better terms than I can, what is actually happening as these VAM-based teacher accountability and evaluation systems are being “lived out” in practice.

Well, the same AZ teacher has written me another story that I encourage you all to read, about the intersection and alignment of standards and their assessments, or more specifically the lack thereof.

She writes:

A fundamental principle in education is the precise alignment of the teaching of learning objectives (standards) with the assessment of those objectives (tests). Research has demonstrated that when an educator plans lessons that begin with an analysis of what students need to learn, coupled with how students will demonstrate that learning, achievement tends to follow. This is a “best practice” in education.

Enter: standards reform. My school district saw the writing on the wall: Common Core implementation was going to be massive. Beyond a shift in the philosophical underpinnings of the standards (college and career readiness vs. every state for itself), Common Core implementation meant, in some cases, a shift in instructional approaches (inquiry vs. modeling). And frequently, Common Core implementation meant changes in what got taught and in which grade levels.

Like any well-meaning, responsible school district, my district realized these changes were going to take time. And so it began Common Core implementation earlier than others, in the 2012-2013 school year. And based on what we know about best practices, when the standards change, the assessments should change as well. But they didn’t—yet. For the past two years, I (and many, many others) have been teaching standards that are NOT fully aligned to the state assessment system. Instead, we’ve been frantically (and some may say schizophrenically!) trying to teach two sets of standards—the old (aligned with Arizona’s current state assessment) and the new Common Core.

Enter: value-added measures. Value-added measures are statistical tools aimed at capturing a teacher’s impact on student achievement through student performance on standardized tests. A few years ago, Arizona passed a law that mandates that up to 50% of a teacher’s evaluation be composed of student test scores. And again, my school district did what any responsible, law-abiding district would do: implement a teacher evaluation system that complies with state law, in which 50% of a teacher’s evaluation is composed of student test scores and 50% is composed of classroom observations.

The intersection of these two policies is a problem for teachers (and students!). If the state assessment is not designed to precisely test the standards that are being taught, it cannot be legitimately claimed that value-added measures (or any other measure using student test scores!) are capturing a teacher’s impact on student achievement. One problem for teachers is that their employment status may hinge on the outcome. One problem for students is that what they are learning may not be what they are ultimately held accountable for on the state assessment.

In this case, compliance with law has superseded the use of best practices. Let us hope this doesn’t happen in another field–say, healthcare?


States Most Likely to Receive Race to the Top Funds

Since 2009, the US Department of Education, via its Race to the Top initiative, has given literally billions in federal taxpayer funds to incentivize states to adopt its preferred educational policies, many of them based on reforms that are neither research-based nor research-informed. Most pertinent here, the main “reform” has been VAMs, with funding going to states that have them, are willing to adopt them, and are willing to use them for low- and preferably high-stakes decisions about teachers, schools, and districts.

Diane Ravitch recently posted a piece about the Education Law Center finding that there was an interesting pattern to the distribution of Race to the Top grants. The Education Law Center, in an Education Justice article, found that the states and districts with the least fair and equitable state school finance systems were the states that won a large share of RTTT grants.

Interesting, indeed, but not surprising. There is an underlying reason for this, based on standard correlations anybody can run using state-level demographics and some basic descriptive statistics.

In this case, correlational analyses reveal that state-level policies that rely at least in part on VAMs are indeed more common in states that allocate less money for schooling than the national average. More specifically, such policies are more likely found in states in which yearly per-pupil expenditures are lower than the national average (as demonstrated in the aforementioned post). They are more likely found in states in which students perform worse, or have lower relative percentages of “proficient” students, as per the US’s (good) national test (i.e., the National Assessment of Educational Progress [NAEP]). They are more likely found in states that have more centralized governments, rather than those with more powerful counties and districts as per local control. They are more likely to be found in more highly populated states and in states with relatively larger populations of poor, racial minority, and language minority students. And they are more likely to be found in red states in which residents predominantly vote for the Republican Party.
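To make this concrete, here is a minimal sketch of the kind of correlation anybody could run. The adoption codes and spending figures below are entirely hypothetical, for illustration only; with a binary adoption indicator, Pearson’s r amounts to a point-biserial correlation:

```python
import numpy as np
from scipy import stats

# Entirely hypothetical state-level data, for illustration only
vam_adoption = np.array([1, 1, 0, 1, 0, 0, 1, 0, 1, 0])    # 1 = consequential VAM policy
per_pupil_spending = np.array([8.2, 7.9, 11.5, 8.8, 12.1,  # yearly spending, in $1,000s
                               10.9, 9.0, 11.8, 8.5, 12.4])

# With a binary indicator, Pearson's r is the point-biserial correlation
r, p = stats.pearsonr(vam_adoption, per_pupil_spending)
print(f"r = {r:.2f}, p = {p:.3f}")  # a negative r mirrors the pattern described above
```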

All of these underlying correlations indeed help explain why such policies are more popular, and accordingly adopted, in certain states versus others. As well, these underlying correlations help to explain the correlation of interest as presented by the Education Law Center in its aforementioned Education Justice article. Indeed, these states disproportionately received Race to the Top funds, just as their political and other state-level demographics would have predicted, as these are the states most likely to climb on board the VAMwagon (noting that some states had already done so prior to Race to the Top and hence won first-round Race to the Top funds [e.g., Tennessee]).

Please note, however, that as with all imperfect correlations found in correlational research, there are outliers. In this case, these include blue states that have adopted VAMs for consequential purposes (e.g., Colorado) and red states that continue to move relatively slowly in terms of their VAM-based policies and initiatives (e.g., Texas, Arizona, and Mississippi). Relatedly, outliers also include states with relatively smaller, and others with relatively larger, populations of poor and minority students and English Language Learners (ELLs).

For more about these correlations and state-level research, please see: Collins, C., & Amrein-Beardsley, A. (2014). Putting growth and value-added models on the map: A national overview. Teachers College Record, 116(1). Retrieved from: http://www.tcrecord.org/Content.asp?ContentId=17291

An Oldie But Still Very Relevant Goodie: The First Documented Value-Added “Smack-Down”

When I first began researching VAMs, and more specifically the Education Value-Added Assessment System (EVAAS) developed by William Sanders in the state of Tennessee (the state we now know as VAM’s “ground zero”), I came across a fabulous online debate (from before blogs like this and other social networking sources were really prevalent) all about this same system, then called the TVAAS (the Tennessee Value-Added Assessment System).

The discussants questioning the TVAAS? Renowned scholars including: Gene Glass — best known for his statistical work and for his development of “meta-analysis”; Michael Scriven — best known for his scholarly work in evaluation; Harvey Goldstein — best known for his knowledge of statistical models and their use with tests; Sherman Dorn — best known for his work on educational reforms and how we problematize our schools; Gregory Camilli — best known for his studies on the effects of educational programs and policies; and a few others with whom I am less familiar. The discussants defending the TVAAS? William Sanders — the TVAAS/EVAAS developer; Sandra P. Horn — Sanders’s colleague; and an unknown discussant representing the “TVAAS (Tennessee Value-Added Assessment System).”

While this was what could now easily be called the first value-added “smack-down” (I am honored to say I was part of the second, and the first so titled), it served as a foundational source for the first study I ever published on the topic of VAMs (a study published in 2008 in the highly esteemed Educational Researcher and titled “Methodological concerns about the Education Value-Added Assessment System [EVAAS]”). I was just reminded, today, about this online debate (or debate made available online) that, although it took place in 1995, is still one of, if not the, best in-depth debates about, and thorough analyses of, VAM ever conducted.

While it is long, it is certainly worth a read and review, as readers too should see in this debate the many issues still relevant and currently problematic, now 20 years later. You can see just how far we’ve really come in the, now, 20 years since this VAM nonsense really got started, as the issues debated there are still, for the most part, the issues that continue to go unresolved…

I’ve pasted one of my favorite highlights here, in case I have not yet enticed you enough…it comes from a post written by Gene Glass on Friday, October 28th, 1994. Gene writes:

“Dear Professor Sanders:

I like statistics; I made the better part of my living off of it for many years. But could we set it aside for just a minute while you answer a question or two for me?

I gather that [the TVAAS] is a means of measuring what it is that a particular teacher contributes to the basic skills learning of a class of students. Let me stipulate for the moment that for your sake all of the purely statistical considerations attendant to partialling out previous contributions of other teachers’ “additions of value” to this year’s teachers’ addition of value have been resolved perfectly–above reproach; no statistician who understands mixed models, covariance adjustment, and the like would question them. Let’s just pretend that this is true.

Now imagine–and it should be no strain on one’s imagination to do so–that we have Teacher A and Teacher B and each has had the pretest (September) achievement status of their students impeccably measured. But A has a class with average IQ of 115 and B has a class of average IQ 90. Let’s suppose that A and B teach to the very limit of their abilities all year long and that in the eyes of God, they are equally talented teachers. We would surely expect that A’s students will achieve much more on the posttest (June) than B’s. Anyone would assume so; indeed, we would be shocked if it were not so.

Question: Does your system of measuring and adjusting and assigning numbers to teachers take these circumstances into account so that A and B emerge with equal “added value” ratings?”

Sandra P. Horn’s answer? “Yes.”

Horn had to say yes in response to Gene’s question, however, or the method would even then have been exposed as entirely invalid. Students with higher levels of measured intelligence undoubtedly tend to learn more than students with lower levels, and if two classes differ greatly on IQ, one will likely make greater progress during the year. That differential growth can have nothing to do with the teacher, and it can be (and still is) observed despite the sophisticated statistical controls meant to account for students’ prior achievement and, in this case, their aptitudes.
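For those who want to see the mechanics behind this, below is a minimal simulation of Gene’s thought experiment, with all numbers invented for illustration. The point is simply that when aptitude affects growth beyond what the pretest captures, even a covariate-adjusted “value-added” estimate will separate two teachers who are, by construction, equally effective:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100  # students per class

def simulate_class(mean_iq):
    iq = rng.normal(mean_iq, 5, n)
    pretest = 0.5 * iq + rng.normal(0, 5, n)   # September score tracks aptitude
    growth = 0.3 * iq + rng.normal(0, 3, n)    # higher-aptitude students gain more
    posttest = pretest + growth                # no teacher effect: A and B equally talented
    return pretest, posttest

pre_a, post_a = simulate_class(115)  # Teacher A's class
pre_b, post_b = simulate_class(90)   # Teacher B's class

# A crude covariate-adjusted "value-added": class-mean residuals from
# regressing posttest on pretest across both classes
pre = np.concatenate([pre_a, pre_b])
post = np.concatenate([post_a, post_b])
slope, intercept = np.polyfit(pre, post, 1)
residuals = post - (intercept + slope * pre)

print("Teacher A 'value-added':", round(residuals[:n].mean(), 2))  # tends to be positive
print("Teacher B 'value-added':", round(residuals[n:].mean(), 2))  # tends to be negative
```

In this sketch, Teacher A tends to come out “above average” and Teacher B “below average,” even though neither, by construction, taught any differently at all.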

Teachers’ Licenses No Longer Jeopardized by THE Tennessee Value-Added Assessment System (TVAAS)

On Friday, April 12th, the Tennessee Board of Education met and rescinded much of the policy that ties teachers’ licenses to their value-added, as determined by the growth in their students’ performance on tests over time as calculated by the all-too-familiar Tennessee Value-Added Assessment System (TVAAS) and its all-too-familiar mother ship, the Education Value-Added Assessment System (EVAAS). This means that “teachers’ licenses can no longer be jeopardized by the changes in their students’ test [scores].”

What a victory for teachers in Tennessee! Especially given that the TVAAS has been in place in the state since 1993, and Tennessee is the state with the longest-running history of using value-added (thanks to education “value-added” creator William Sanders).

Although, again, it might be too soon to celebrate, as rumor has it that some of the other consequences tied to TVAAS output are still to be pushed forward. Rather, this might be viewed or interpreted as a symbolic gesture or compromise, one that opens up room for further compromise in the future.

Accordingly, David Sevier, the Tennessee State Board’s Deputy Executive Director, also noted that the Board of Education can “now begin to find a measure that is comparable to TVAAS that all parties can agree on.” I’m not quite sure how to interpret this, though, in terms of whether they think it’s just the TVAAS that’s the problem when it comes to evaluating their teachers (also for licensure purposes), and that another VAM can satisfy their needs and ideals, or not. Hopefully, the latter is true.

To read more about this from the original article, click here.

My Book on VAMs on Pre-Order, and Available on May 7, 2014

The book I spent all last year writing, titled “Rethinking Value-Added Models in Education: Critical Perspectives on Tests and Assessment-Based Accountability,” with its foreword written by Diane Ravitch (please see its cover below), is to be released by my publisher — Routledge — on May 7, 2014, although it is now available for pre-order.

For those of you who are interested (and who have also inquired about this book’s release), you can (pre)order this book on Amazon, here, for $34.62 (free shipping, of course, on orders over $35); you can find it via Barnes & Noble, here, for $34.42; but you can also go directly to the Routledge site, here, and (pre)order it for $31.96 if you use the following 20% off discount code at checkout: IRK69.

Just so everybody knows, though, I am donating all of my personal royalties to the ACODO Orphanage in Siem Reap, Cambodia. You can find out why I am doing this in the beginning of the book, where I explain how Cambodia relates to my take on VAMs. On a more general note, I have no financial interest in this. While I care deeply about this topic, as evidenced herein with this blog, and while I feel that as a scholar I am fighting a good fight, particularly against those who are not making informed and/or research-based decisions when it comes to VAMs, I have absolutely no interest in making any money off of the many (often shameful) consequences that come about as a result of inappropriately attaching high-stakes consequences to, and/or making high-stakes decisions based on, VAMs.

Below are some of the more specific details on the book, in case this information helps you make your decision to purchase, and of course read, and hopefully use widely, and if the situation calls for it, wildly!

Paperback: 256 pages; Chapters: 8, titled as follows:

  1. Socially Engineering the Road to Utopia
  2. Value-Added Models (VAMs) and the Human Factor
  3. A VAMoramic View of the Nation
  4. Assumptions Used as Rationales and Justifications
  5. Test-Based, Statistical, and Methodological Assumptions
  6. Reliability and Validity
  7. Bias and the Random Assignment of Students into Classrooms
  8. Alternatives, Solutions, and Conclusions

Book Cover: [cover image]


The Center for American “Progress”: Really?!?

The American Statistical Association recently released a position statement on VAMs–a statement that really should/could serve as one of a few “nail in the coffin” reports on VAMs. Others, however, continue to move forward with VAMs despite this position statement and all of its surrounding research.

That being said, who is the newest group to promote and push VAMs? The Center for American Progress.

As per their website, “The Center for American Progress is an independent nonpartisan educational institute dedicated to improving the lives of Americans through progressive ideas and action. Building on the achievements of progressive pioneers such as Teddy Roosevelt and Martin Luther King, our work addresses 21st-century challenges such as energy, national security, economic growth and opportunity, immigration, education, and health care. We develop new policy ideas, critique the policy that stems from conservative values, challenge the media to cover the issues that truly matter, and shape the national debate.”

Progressive? Pioneering?? Advancing “new” policy ideas??? Not so fast!!

According to a report they just released, here are exactly the progressive and pioneering policy ideas they are advancing, summarized below specifically in terms of VAMs.

They write: “[W]e know much more about the impact of high-quality teaching on student achievement,” citing in their PDF their take on a set of “research” studies that, on this and other points throughout the paper, are drawn almost entirely from technical reports written by the US Department of Education, William Sanders (the developer of the EVAAS system), Doug Harris (a “cautious” but quite active proponent of VAMs), and Eric Hanushek (whose economic research is almost always cited when policy wonks are interested in advancing VAMs).

The research they cite is notably a small subset of the actual research out there on VAMs. That larger body of research was used to rightfully construct the aforementioned position statement released by the ASA, and it has evidenced for decades that teachers account for, or can be credited for, approximately 10% of the variance in student test scores, while the other 90% is typically due to factors outside of teachers’ control.
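To illustrate what “accounting for approximately 10% of the variance” means in practice, here is a toy simulation, with all parameters invented, in which teacher effects are built to explain 10% of the variance in scores and then recovered with a simple between/within (i.e., ANOVA-style) decomposition:

```python
import numpy as np

rng = np.random.default_rng(42)
n_teachers, n_students = 200, 30

# Build scores so that teacher effects explain 10% of total variance by design
teacher_effects = rng.normal(0, np.sqrt(0.10), n_teachers)      # variance 0.10
noise = rng.normal(0, np.sqrt(0.90), (n_teachers, n_students))  # variance 0.90
scores = teacher_effects[:, None] + noise

# Method-of-moments recovery: between-class variance, minus its sampling noise
within = scores.var(axis=1, ddof=1).mean()
between = scores.mean(axis=1).var(ddof=1) - within / n_students
print(f"share of variance due to teachers: {between / (between + within):.1%}")  # ~10%
```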

Regardless, while the Center for American Progress briefly acknowledges this, they spin it into their solution: the reason this percentage is so low, they argue, is that we have not yet been accounting for growth in student achievement over time; that is, via value-added models (VAMs). In other words, using more sophisticated models of measurement (i.e., VAMs) will supposedly help to illuminate the “real” results we know are out there but simply have not been able to capture given our archaic models of measurement and teacher accountability.

Not to worry, though, as they write that these “[n]ew measures of teacher effectiveness, determined by evidence of teacher practice and improvements in student achievement, are now available [emphasis added] and provide strong markers for assessing teaching quality and the equitable distribution of the most capable teachers.”

Yes, they are now available, but they have been both available and in use, particularly in the state of Tennessee, the state in which “the best” VAM has been available and in use since the early 1990s. Yet the state of Tennessee still has not evidenced much of anything, especially in terms of the intended outcomes its (best) VAM has been tasked with producing for more than two decades, at a hefty price of $1.7 million per year, I might add. See a recent summary about this here.

Regardless, and despite the evidence, or lack thereof, “Current federal education policy [also] reflects this new [emphasis added] understanding [given] its accompanying [policy] changes.”

Really? No seriously, really!?!

It goes on: “This is an opportunity to reset the old and align with the new. It is now possible to address concerns about teacher quality in broader, more creative ways that incorporate thoughtful approaches to prepare teachers and school leaders to successfully support learning for all students; hire and recruit the best future educators based on evidence of their performance; reward and retain the best teachers we have in place; create work environments capable of supporting and sustaining a well-prepared and effective teacher workforce; and address the structural causes of inequitable teacher distribution embedded in how we fund and staff our schools. It is time to jettison policies that act as barriers to staffing and compensating the most effective teachers for the most challenging schools and working assignments.”

And it goes on, including recommendations for attaching consequential decisions to VAM estimates…Read more, or not.

Unfortunately, this reminds me of the Saturday Night Live skit “REALLY?!?” with Seth and Amy, which has repeatedly given viewers a “hilarious, sarcastic look” at some of the biggest issues in American news. Unfortunately, this version is not nearly as comedic or laughable.

“Devaluing Teachers in the Age of Value-Added” by P. L. Thomas

Just this week, P. L. Thomas, an Associate Professor at Furman University, wrote a very good piece about VAMs, some of their philosophical and historical underpinnings, and some of the recent studies/papers released about VAMs, including the recent American Statistical Association (ASA) Position Statement (reviewed in this blog here and just yesterday here), the recent work of Edward Haertel (covered in this blog here, here, and here), some of the wisdom of Rutgers Professor Bruce Baker (mentioned in this blog here and here), and the outstanding, I mean outlandish, Chetty et al. study (covered in this blog here and here). Thanks to my former colleague for sending this along!

Anyhow, here is the direct link to the actual post, although I have also re-posted the initial piece below, especially for those of you reading via your handhelds and/or email. Do visit the actual post, though, and perhaps follow his blog, as a quick perusal of the blog’s content will likely evidence that there is some other good content therein.

But for this post, specifically, Thomas writes:

“Devaluing Teachers in the Age of Value-Added”

“We teach the children of the middle class, the wealthy and the poor,” explains Anthony Cody, continuing:

We teach the damaged and disabled, the whole and the gifted. We teach the immigrants and the dispossessed natives, the transients and even the incarcerated.

In years past we formed unions and professional organizations to get fair pay, so women would get the same pay as men. We got due process so we could not be fired at an administrator’s whim. We got pensions so we could retire after many years of service.

But career teachers are not convenient or necessary any more. We cost too much. We expect our hard-won expertise to be recognized with respect and autonomy. We talk back at staff meetings, and object when we are told we must follow mindless scripts, and prepare for tests that have little value to our students.

During the 1980s and 1990s, U.S. public schools and the students they serve felt the weight of standards- and test-based accountability—a bureaucratic process that has wasted huge amounts of tax-payers’ money and incalculable time and energy assigning labels, rankings, and blame. The Reagan-era launching of accountability has lulled the U.S. into a sort of complacency that rests on maintaining a gaze on schools, students, and test data so that no one must look at the true source of educational failure: poverty and social inequity, including the lingering corrosive influences of racism, classism, and sexism.

The George W. Bush and Barack Obama eras—resting on intensified commitments to accountability such as No Child Left Behind (NCLB) and Race to the Top (RTTT)—have continued that misguided gaze and battering, but during the past decade-plus, teachers have been added to the agenda.

As Cody notes above, however, simultaneously political leaders, the media, and the public claim that teachers are the most valuable part of any student’s learning (a factually untrue claim), but that high-poverty and minority students can be taught by those without any degree or experience in education (Teach for America) and that career teachers no longer deserve their profession—no tenure, no professional wages, no autonomy, no voice in what or how they teach.

And while the media and political leaders maintain these contradictory narratives and support these contradictory policies, value-added methods (VAM) of evaluating and compensating U.S. public teachers are being adopted, again simultaneously, as the research base repeatedly reveals that VAM is yet another flawed use of high-stakes accountability and testing.

When Raj Chetty, John N. Friedman, and Jonah E. Rockoff released (and re-released) reports claiming that teacher quality equates to significant earning power for students, the media and political leaders tripped over themselves to cite (and cite) those reports.

What do we know about the Chetty, et al., assertions?

From 2012:

[T]hose using the results of this paper to argue forcefully for specific policies are drawing unsupported conclusions from otherwise very important empirical findings. (Di Carlo)

These are interesting findings. It’s a really cool academic study. It’s a freakin’ amazing data set! But these findings cannot be immediately translated into what the headlines have suggested – that immediate use of value-added metrics to reshape the teacher workforce can lift the economy, and increase wages across the board! The headlines and media spin have been dreadfully overstated and deceptive. Other headlines and editorial commentary has been simply ignorant and irresponsible. (No Mr. Moran, this one study did not, does not, cannot negate the vast array of concerns that have been raised about using value-added estimates as blunt, heavily weighted instruments in personnel policy in school systems.) (Baker)

And now, a thorough review concludes:

Can the quality of teachers be measured the way that a person’s weight or height is measured? Some economists have tried, but the “value-added” they have attempted to measure has proven elusive. The results have not been consistent over tests or over time. Nevertheless, a two-part report by Raj Chetty and his colleagues claims that higher value-added scores for teachers lead to greater economic success for their students later in life. This review of the methods of Chetty et al. focuses on their most important result: that teacher value-added affects income in adulthood. Five key problems with the research emerge. First, their own results show that the calculation of teacher value-added is unreliable. Second, their own research also generated a result that contradicts their main claim—but the report pushed that inconvenient result aside. Third, the trumpeted result is based on an erroneous calculation. Fourth, the report incorrectly assumes that the (miscalculated) result holds across students’ lifetimes despite the authors’ own research indicating otherwise. Fifth, the report cites studies as support for the authors’ methodology, even though they don’t provide that support. Despite widespread references to this study in policy circles, the shortcomings and shaky extrapolations make this report misleading and unreliable for determining educational policy.

Similar to the findings in Edward H. Haertel’s analysis of VAM, Reliability and validity of inferences about teachers based on student test scores (ETS, 2013), the American Statistical Association has issued ASA Statement on Using Value-Added Models for Educational Assessment, emphasizing:

Research on VAMs has been fairly consistent that aspects of educational effectiveness that are measurable and within teacher control represent a small part of the total variation in student test scores or growth; most estimates in the literature attribute between 1% and 14% of the total variability to teachers. This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.

The VAM scores themselves have large standard errors, even when calculated using several years of data. These large standard errors make rankings unstable, even under the best scenarios for modeling. Combining VAMs across multiple years decreases the standard error of VAM scores. Multiple years of data, however, do not help problems caused when a model systematically undervalues teachers who work in specific contexts or with specific types of students, since that systematic undervaluation would be present in every year of data.

Among DiCarlo, Baker, Haertel and the ASA, several key patterns emerge regarding VAM: (1) VAM remains an experimental statistical model, (2) VAM is unstable and significantly impacted by factors beyond a teacher’s control and beyond the scope of that statistical model to control, and (3) implementing VAM in high-stakes policies exaggerates the flaws of VAM.

The rhetoric about valuing teachers rings hollow more and more as teaching continues to be dismantled and teachers continue to be devalued by misguided commitments to VAM and other efforts to reduce teaching to a service industry.

VAM as reform policy, like NCLB, is sham-science being used to serve a corporate need for cheap and interchangeable labor. VAM, ironically, proves that evidence does not matter in education policy.

Like all workers in the U.S., we simply do not value teachers.

Political leaders, the media, and the public call for more tests for schools, teachers, and students, but they continue to fail themselves to acknowledge the mounting evidence against test-based accountability.

And thus, we don’t need numbers to prove what Cody states directly: “But career teachers are not convenient or necessary any more.”

The Washington Post also Captures the ASA’s Recent Position Statement on VAMs

Yesterday I released the second of two posts about the recent release of the American Statistical Association’s Position Statement about using VAMs for educational assessment and accountability. Below is a third post, as again warranted, given the considerable significance of this statement.

This one comes from The Washington Post – The Answer Sheet by Valerie Strauss and is pasted here, almost in full.

Strauss writes: “You can be certain that members of the American Statistical Association, the largest organization in the United States representing statisticians and related professionals, know a thing or two about data and measurement. That makes the statement that the association just issued very important for school reform.

The ASA just slammed the high-stakes “value-added method” (VAM) of evaluating teachers that has been increasingly embraced in states as part of school-reform efforts. VAM purports to be able to take student standardized test scores and measure the “value” a teacher adds to student learning through complicated formulas that can supposedly factor out all of the other influences and emerge with a valid assessment of how effective a particular teacher has been.

These formulas can’t actually do this with sufficient reliability and validity, but school reformers have pushed this approach and now most states use VAM as part of teacher evaluations. Because math and English test scores are available, reformers have devised bizarre implementation methods in which teachers are assessed on the test scores of students they don’t have or subjects they don’t teach. When Michelle Rhee was chancellor of D.C. public schools (2007-10), she was so enamored with using student test scores to evaluate adults that she implemented a system in which all adults in a school building, including the custodians, were in part evaluated by test scores.

Assessment experts have been saying for years that this is an unfair way to evaluate anybody, especially for high-stakes purposes such as pay, employment status, tenure or even the very survival of a school. But reformers went ahead anyway on the advice of some economists who have embraced the method (though many other economists have panned it). Now the statisticians have come out with recommendations for the use of VAM for teachers, principals and schools that school reformers should — but most likely won’t — take to heart…”

“…Some economists have gone so far as to say that higher VAM scores for teachers lead to more economic success for their students later in life. Work published by the National Bureau of Economic Research, done by authors Raj Chetty, John N. Friedman and Jonah E. Rockoff [the source of a prior VAMboozled! post here], has made that claim, though there are some big problems with their research, according to an analysis of their latest study published [titled “Lots of Impact, Little Value”] by the National Education Policy Center at the University of Colorado Boulder [also the source of a VAMboozled! post soon to be released]. The analysis finds a number of key problems with the report making the link between VAM of teachers and financial success of students, including the fact that their own results show that VAM calculation for teachers is unreliable…”

“…The evidence against VAM is at this point overwhelming. The refusal of school reformers to acknowledge it is outrageous.”


American Statistical Association (ASA) Position Statement on VAMs

In my most recent post, about the Top 14 research-based articles about VAMs, I included a great research-based statement that was released just last week by the American Statistical Association (ASA), titled the “ASA Statement on Using Value-Added Models for Educational Assessment.”

It is short, accessible, easy to understand, and hard to dispute, so I wanted to be sure nobody missed it, as it is certainly a must-read for all of you following this blog, not to mention everybody else dealing/working with VAMs and their related educational policies. Likewise, it represents the current, research-based evidence and thinking of probably 90% of the educational researchers and econometricians (still) conducting research in this area.

Again, the ASA is the best statistical organization in the U.S. and likely one of, if not the, best statistical associations in the world. Some of the most important parts of their statement, taken directly from their full statement as I see them, follow:

  1. VAMs are complex statistical models, and high-level statistical expertise is needed to develop the models and [emphasis added] interpret their results.
  2. Estimates from VAMs should always be accompanied by measures of precision and a discussion of the assumptions and possible limitations of the model. These limitations are particularly relevant if VAMs are used for high-stakes purposes.
  3. VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward other student outcomes.
  4. VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.
  5. Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models [see the sketch just after this list].
  6. VAMs should be viewed within the context of quality improvement, which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools.
  7. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.
  8. Attaching too much importance to a single item of quantitative information is counter-productive—in fact, it can be detrimental to the goal of improving quality.
  9. When used appropriately, VAMs may provide quantitative information that is relevant for improving education processes…[but only if used for descriptive/description purposes]. Otherwise, using VAM scores to improve education requires that they provide meaningful information about a teacher’s ability to promote student learning…[and they just do not do this at this point, as there is no research evidence to support this ideal].
  10. A decision to use VAMs for teacher evaluations might change the way the tests are viewed and lead to changes in the school environment. For example, more classroom time might be spent on test preparation and on specific content from the test at the exclusion of content that may lead to better long-term learning gains or motivation for students. Certain schools may be hard to staff if there is a perception that it is harder for teachers to achieve good VAM scores when working in them. Overreliance on VAM scores may foster a competitive environment, discouraging collaboration and efforts to improve the educational system as a whole.
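As for point 5 above, here is a minimal sketch, again with invented data, of how two defensible models (a simple gain score versus a regression-adjusted residual) can rank the very same teachers on the very same data quite differently:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_teachers, n = 50, 25

# Invented data: classes differ in incoming achievement; true teacher effects are known
class_baseline = rng.normal(50, 10, n_teachers)
true_effect = rng.normal(0, 2, n_teachers)
pre = class_baseline[:, None] + rng.normal(0, 5, (n_teachers, n))
post = 0.8 * pre + true_effect[:, None] + rng.normal(0, 8, (n_teachers, n))

# Model 1: average raw gain score per class
gain = (post - pre).mean(axis=1)

# Model 2: class-mean residuals after regressing posttest on pretest
slope, intercept = np.polyfit(pre.ravel(), post.ravel(), 1)
adjusted = (post - (intercept + slope * pre)).mean(axis=1)

rho, _ = stats.spearmanr(gain, adjusted)
print(f"rank correlation between the two models: {rho:.2f}")  # noticeably below 1.0
```

Neither model is obviously “wrong” on its face, which is precisely the problem when the resulting rankings carry high-stakes consequences.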

Also important to point out is that included in the report the ASA makes recommendations regarding the “key questions states and districts [yes, practitioners!] should address regarding the use of any type of VAM.” These include, although they are not limited to, questions about reliability (consistency), validity, the tests on which VAM estimates are based, and the major statistical errors that always accompany VAM estimates but are often buried and often not reported with results (i.e., in terms of confidence intervals or standard errors).
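And to illustrate why those buried errors matter, consider one hypothetical teacher’s estimate alongside its standard error (both numbers invented); the 95% confidence interval that should accompany any such estimate quite often spans zero:

```python
from scipy import stats

# Hypothetical output for one teacher: the effect estimate and its standard error
estimate, std_error = 0.06, 0.09   # in student-level standard-deviation units

z = stats.norm.ppf(0.975)          # about 1.96 for a 95% interval
low, high = estimate - z * std_error, estimate + z * std_error
print(f"estimate = {estimate}, 95% CI = [{low:.2f}, {high:.2f}]")
# The interval spans zero, so labeling this teacher "above average" or
# "below average" is not supported by the estimate itself
```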

Also important is the purpose for ASA’s statement, as written by them: “As the largest organization in the United States representing statisticians and related professionals, the American Statistical Association (ASA) is making this statement to provide guidance, given current knowledge and experience, as to what can and cannot reasonably be expected from the use of VAMs. This statement focuses on the use of VAMs for assessing teachers’ performance but the issues discussed here also apply to their use for school or principal accountability. The statement is not intended to be prescriptive. Rather, it is intended to enhance general understanding of the strengths and limitations of the results generated by VAMs and thereby encourage the informed use of these results.”

Do give the position statement a read and use it as needed!

Correction: Make the “Top 13” VAM Articles the “Top 14”

As per my most recent post earlier today, about the Top 13 research-based articles about VAMs, lo and behold, another great research-based statement was just this week released by the American Statistical Association (ASA), titled the “ASA Statement on Using Value-Added Models for Educational Assessment.”

So, let’s make the Top 13 the Top 14 and call it a day. I say “day” deliberately; this is such a hot and controversial topic that it is often hard to keep up with the literature in this area on literally a daily basis.

As per this outstanding statement released by the ASA – the best statistical organization in the U.S. and one of, if not the, best statistical associations in the world – some of the most important parts of their statement, taken directly from their full statement as I see them, follow:

  1. VAMs are complex statistical models, and high-level statistical expertise is needed to develop the models and [emphasis added] interpret their results.
  2. Estimates from VAMs should always be accompanied by measures of precision and a discussion of the assumptions and possible limitations of the model. These limitations are particularly relevant if VAMs are used for high-stakes purposes.
  3. VAMs are generally based on standardized test scores, and do not directly measure potential teacher contributions toward other student outcomes.
  4. VAMs typically measure correlation, not causation: Effects – positive or negative – attributed to a teacher may actually be caused by other factors that are not captured in the model.
  5. Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.
  6. VAMs should be viewed within the context of quality improvement, which distinguishes aspects of quality that can be attributed to the system from those that can be attributed to individual teachers, teacher preparation programs, or schools.
  7. Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.
  8. Attaching too much importance to a single item of quantitative information is counter-productive—in fact, it can be detrimental to the goal of improving quality.
  9. When used appropriately, VAMs may provide quantitative information that is relevant for improving education processes…[but only if used for descriptive/description purposes]. Otherwise, using VAM scores to improve education requires that they provide meaningful information about a teacher’s ability to promote student learning…[and they just do not do this at this point, as there is no research evidence to support this ideal].
  10. A decision to use VAMs for teacher evaluations might change the way the tests are viewed and lead to changes in the school environment. For example, more classroom time might be spent on test preparation and on specific content from the test at the exclusion of content that may lead to better long-term learning gains or motivation for students. Certain schools may be hard to staff if there is a perception that it is harder for teachers to achieve good VAM scores when working in them. Overreliance on VAM scores may foster a competitive environment, discouraging collaboration and efforts to improve the educational system as a whole.

Also important to point out is that included in the report the ASA makes recommendations regarding the “key questions states and districts [yes, practitioners!] should address regarding the use of any type of VAM.” These include, although they are not limited to, questions about reliability (consistency), validity, the tests on which VAM estimates are based, and the major statistical errors that always accompany VAM estimates but are often buried and often not reported with results (i.e., in terms of confidence intervals or standard errors).

Also important is the purpose for ASA’s statement, as written by them: “As the largest organization in the United States representing statisticians and related professionals, the American Statistical Association (ASA) is making this statement to provide guidance, given current knowledge and experience, as to what can and cannot reasonably be expected from the use of VAMs. This statement focuses on the use of VAMs for assessing teachers’ performance but the issues discussed here also apply to their use for school or principal accountability. The statement is not intended to be prescriptive. Rather, it is intended to enhance general understanding of the strengths and limitations of the results generated by VAMs and thereby encourage the informed use of these results.”

If you’re going to choose one article to read and review this week or this month, and one that is thorough and to the point, this is the one I recommend you read…at least for now!