One of my prior posts was about the peer-reviewed journal Educational Researcher (ER)’s “Special Issue” on VAMs, and specifically about the commentary contributed to that issue, titled “Can Value-Added Add Value to Teacher Evaluation?,” by Linda Darling-Hammond, Professor of Education, Emerita, at Stanford University.
In this post, I noted that Darling-Hammond “added” a lot of “value” in one particular section of her commentary, in which she offered a very sound set of solutions, whether or not VAMs are used for teacher evaluations. Given that it is rare in this area of research to focus on actual solutions, and given that this section is a must-read, I paste it here for you all to read (and bookmark, especially if you are currently grappling with how to develop good evaluation systems that must meet external mandates requiring VAMs).
Here is Darling-Hammond’s “Modest Proposal” (pp. 135–136):
What if, instead of insisting on the high-stakes use of a single approach to VAM as a significant percentage of teachers’ ratings, policymakers were to acknowledge the limitations that have been identified and allow educators to develop more thoughtful approaches to examining student learning in teacher evaluation? This might include sharing with practitioners honest information about imprecision and instability of the measures they receive, with instructions to use them cautiously, along with other evidence that can help paint a more complete picture of how students are learning in a teacher’s classroom. An appropriate warning might alert educators to the fact that VAM ratings based on state tests are more likely to be informative for students already at grade level, and least likely to display the gains of students who are above or below grade level in their knowledge and skills. For these students, other measures will be needed.
What if teachers could create a collection of evidence about their students’ learning that is appropriate for the curriculum and students being taught and targeted to goals the teacher is pursuing for improvement? In a given year, one teacher’s evidence set might include gains on the vertically scaled Developmental Reading Assessment she administers to students, plus gains on the English language proficiency test for new English learners, and rubric scores on the beginning and end of the year essays her grade level team assigns and collectively scores.
Another teacher’s evidence set might include the results of the AP test in Calculus with a pretest on key concepts in the course, plus pre- and posttests on a unit regarding the theory of limits which he aimed to improve this year, plus evidence from students’ mathematics projects using trigonometry to estimate the distance of a major landmark from their home. VAM ratings from a state test might be included when appropriate, but they would not stand alone as though they offered incontrovertible evidence about teacher effectiveness.
Evaluation ratings would combine the evidence from multiple sources in a judgment model, as Massachusetts’ plan does, using a matrix to combine and evaluate several pieces of student learning data, and then integrate that rating with those from observations and professional contributions. Teachers would receive low or high ratings when multiple indicators point in the same direction. Rather than merely tallying up disparate percentages and urging administrators to align their observations with inscrutable VAM scores, this approach would identify teachers who warrant intervention while enabling pedagogical discussions among teachers and evaluators based on evidence that connects what teachers do with how their students learn. A number of studies suggest that teachers become more effective as they receive feedback from standards-based observations and as they develop ways to evaluate their students’ learning in relation to their practice (Darling-Hammond, 2013).
If the objective is not just to rank teachers and slice off those at the bottom, irrespective of accuracy, but instead to support improvement while providing evidence needed for action, this modest proposal suggests we might make more headway by allowing educators to design systems that truly add value to their knowledge of how students are learning in relation to how teachers are teaching.
*****
If interested, see the Review of Article #1 – the introduction to the special issue here; see the Review of Article #2 – on VAMs’ measurement errors, issues with retroactive revisions, and (more) problems with using standardized tests in VAMs here; see the Review of Article #3 – on VAMs’ potentials here; see the Review of Article #4 – on observational systems’ potentials here; see the Review of Article #5 – on teachers’ perceptions of observations and student growth here; see the Review of Article (Essay) #6 – on VAMs as tools for “egg-crate” schools here; see the Review of Article (Commentary) #7 – on VAMs situated in their appropriate ecologies here; and see the Review of Article #8, Part I – on a more research-based assessment of VAMs’ potentials here.
Article #8, Part II Reference: Darling-Hammond, L. (2015). Can value-added add value to teacher evaluation? Educational Researcher, 44(2), 132-137. doi:10.3102/0013189X15575346
I am amazed at the absolute reliance, in this and so many other efforts to address accountability, on examples of content from math and from reading, as if there were not different issues when you are teaching other subjects, with units of study that are discontinuous and an organizing name for the whole course so general it is not much more than a mailbox. For example, it is not unusual for a general exploratory course in the visual arts for middle school to have several units with some opportunities for choices of projects: drawing and painting, graphic design, architecture, sculpture, photography, and so on. In the same district, a different teacher may offer a combination of theme-based units for individual exploration, with choices in media, along with a choice of collaborative projects. Both teachers introduce the students to art history, ideas about looking at and thinking about works of art, and some technical vocabulary, but in no sense are these units conceived around the idea that every student will master all of the information in circulation.
Pre- and posttests can be forced into these curriculum structures, as in the writing exercises called student learning objectives, but to what end? Teachers, students, and sometimes groups of students can be required to document learning and to meet standards, but the achievements of students across these exploratory projects will, in the end, reflect different affinities, prior levels of instruction, and constraints of all kinds in schedules, class composition, resources, and so on.
The tail is wagging the dog when the demand for accountability is focused on ideas about alignment, mastery, standards, improvement, and “effectiveness” without serious consideration of the teaching context and the full spectrum of content and forms of evaluation appropriate to a full-spectrum, multifaceted educational program, including the aims of instruction at various levels. And context matters: if your classes meet every other week on an A/B-week schedule, what counts as improvement in practice is different from what it is with daily instruction.