Jimmy Scherrer – Assistant Professor at North Carolina State University and former teacher and mathematics instructional coach in the Los Angeles Unified School District (LAUSD) – is a rising star in the education academy, thanks in large part to his research on VAMs as well as mathematics, educational policy, and the like. I’ve cited one of the pieces he wrote in 2011 (for an educational practitioner audience) many times, as the way he carefully deconstructed some of the assumptions surrounding VAMs and VAM uses in this piece speaks volumes about much of the absurdity surrounding them. See the full PDF of this article here. See also the full reference for this piece, “Measuring Teaching Using Value-Added Modeling: The Imperfect Panacea,” here, in my list of the “Top 25 Research Articles” about VAMs.
Well, Scherrer just published a new article titled “The Limited Utility of Value-Added Modeling in Education Research and Policy,” and I invited him to write a blog post about it for you all here. Scherrer graciously agreed, and wrote the following:
As someone who works with students in poverty [see also a recent article Scherrer wrote in the highly esteemed, peer-reviewed Educational Researcher here], I am deeply troubled by the use of status measures—the raw scores of standardized assessments—for accountability purposes. The relationship between SES and standardized assessment scores is well known. Thus, using status measures for accountability purposes incentivizes teachers to work in the most advantaged schools.
So, I am pleased with the increasing number of accountability systems that are moving away from status measures. In their place, systems seem to be favoring value-added estimates. In theory, this is a significant improvement. However, the manner in which the models are currently being used, and the way the estimates are currently being interpreted, is intellectually criminal. The models’ limitations are obvious. But, as a learning scientist, what’s most alarming is the increasing use of the estimates generated by value-added models as a proxy for “effective” teaching. Here’s why:
Different teaching practices reflect different pedagogical epistemologies. These epistemologies are rooted in various learning perspectives. Different perspectives correspond to different assumptions about how to teach and, ultimately, how to assess. When the education policy community discusses “effectiveness,” the articulation of these different conceptions of learning matters, and considerations of consistency and coherence across learning, teaching, and assessing need to come into the discourse.
Typically, research studies on teaching and learning are framed using one of three perspectives: the behaviorist, the cognitivist, and the situative. Each perspective is associated with a different grain size. The behaviorist perspective focuses on basic skills, such as arithmetic. The cognitivist perspective focuses on conceptual understanding, such as making connections between addition and multiplication. The situative perspective focuses on practices, such as the ability to make and test conjectures. Effective teaching includes providing opportunities for students to strengthen each focus. However, traditional standardized assessments mainly contain questions that are crafted from a behaviorist perspective. The conceptual understanding that is highlighted in the cognitivist perspective and the participation in practices that is highlighted in the situative perspective are not captured on traditional standardized assessments. Thus, the only valid inference that can be made from a value-added estimate is about a teacher’s ability to teach the basic skills and knowledge associated with the behaviorist perspective.
When using assessment data to make an inference about classroom teaching, there needs to be coherence and consistency within a learning perspective. Claims of “effectiveness” can only be made between types of learning and types of teaching that are rooted in the same perspective. The current practice of using value-added estimates as a proxy for effective teaching introduces a “leap” across perspectives. That is, scores from traditional standardized assessments rooted in behaviorism are being used to make inferences about classroom teaching practices that are coherent with different perspectives. This “leap” essentially eliminates the ability to make a connection between high value-added estimates and current notions of effective classroom teaching.
Simply using value-added estimates as a proxy for effective teaching is intellectually lazy. If the education policy community is serious about improving the quality of teaching, then any accountability system must articulate what quality teaching is (not what it produces!). Until then, we all leap at our own risk.
Contact Jimmy Scherrer at firstname.lastname@example.org and/or follow him on Twitter: @jimmyscherrer. Thanks Jimmy!
It’s a good point to think about the alignment between standardized tests and current teaching philosophies and methods! Accountability research in education is important and challenging. Any comments about the Quality Counts report by Education Week (http://www.edweek.org/ew/qc/)?
I have only read the “executive summary” of the Quality Counts report (and then attended a brief talk about it). Thus, I am not sure I can make an intelligent connection to my paper.
In general, what turns me off about reports like that is their practice of “grading.” For example, the “grading summary” of the Quality Counts report assigns the US a C+ for “Chance of Success.” What does that mean? Sure, we can do a bit of research and track down how they assigned the grade. But why not just share with us the data that they have and let us (the public) assign our own grade? Who gave Ed Week (and others who do the same thing) the privilege of thinking for us?
I think this practice is very similar to a state department of education assigning your child’s teacher a value-added score of +4. That number is not helpful. Give us a thick description of what’s going on in the classroom and let us (the public) decide if we think it is “above average.”