On Thursday (November 14th), I wrote about what is happening in the Los Angeles courts regarding the LA Times’ controversial open public records request for the student test scores of all Los Angeles Unified School District (LAUSD) teachers (see LA Times up to the Same Old Antics). Today I write about the same thing happening in the Florida courts, following The Florida Times-Union’s suit against the state (see Florida teacher value-added data is public record, appeal court rules).
As the headline reads, “Florida’s controversial value-added teacher data are [to be] public record” and are to be released to The Times-Union for the same reasons and purposes they are being released, again, to the LA Times. These (in many ways right) reasons include: (1) the data should not be exempt from public inspection, (2) these records, because they are public, should be open for public consumption, (3) parents as members of the public and direct consumers of these data should be granted access, and the list goes on.
What is in too many ways wrong, however, is that while the court wrote that the data are “only one part of a larger spectrum of criteria by which a public school teacher is evaluated,” the data will be consumed by a highly assuming, highly unaware, and highly uninformed public as the only source of data that count.
Worse, because the data are derived via complex analyses of “objective” test scores that yield (purportedly) hard numbers from which teacher-level value-added can be calculated using even more complex statistics, the public will trust the statisticians behind the scenes, because they are smart, and they will consume the numbers as true and valid because smart people constructed them.
The key here, though, is that they are, in fact, constructed. In every step of the process of constructing the value-added data, there are major issues that arise and major decisions that are made. Both cause major imperfections in the data that will in the end come out clean (i.e., as numbers), even though they are still super dirty on the inside.
As written by The Times-Union reporter, “Value-added is the difference between the learning growth a student makes in a teacher’s class and the statistically predicted learning growth the student should have earned based on previous performance.” It is just not as simple and as straightforward as that.
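Taken at face value, the reporter's definition could be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the records are invented, the "statistically predicted" growth comes from a single straight-line fit on prior scores, and the names `fit_line` and `value_added` are hypothetical. It is not the model Florida or any other state actually uses.

```python
from statistics import mean

# Hypothetical records: (teacher, prior_score, current_score). Invented data.
records = [
    ("A", 300, 320),
    ("A", 310, 335),
    ("A", 290, 305),
    ("B", 300, 310),
    ("B", 320, 330),
    ("B", 280, 295),
]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def value_added(records):
    """Average (actual - predicted) current score per teacher."""
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    # Assumption: prediction uses prior score only, fit across all students.
    a, b = fit_line(priors, currents)
    residuals = {}
    for teacher, prior, current in records:
        predicted = a + b * prior
        residuals.setdefault(teacher, []).append(current - predicted)
    return {teacher: mean(vals) for teacher, vals in residuals.items()}

estimates = value_added(records)
print(estimates)
```

Even in this toy version, choices had to be made about what goes into the prediction, and the two teachers' estimates mirror each other around zero because each is measured against the pooled prediction. The real calculation is nowhere near this tidy.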
Here are just some reasons why: (1) large-scale standardized achievement tests offer very narrow measures of what students have achieved, although they are assumed to measure the breadth and depth of student learning covering an entire year; (2) the 40 to 50 total tested items do not represent the hundreds of more complex items we value more; (3) test developers use statistical tools to remove the items that too many students answered correctly, making much of what we value not at all valued on the tests; (4) calculating value-added, or growth “upwards” over time, requires that the scales used to measure growth from one year to the next have equal units, but this is (to my knowledge) never the case; (5) otherwise, the standard statistical response is to norm the test scores before using them, but this then means that for every winner there must be a loser, or in this case that as some teachers get better, other teachers must get worse, which does not reflect reality; (6) then value-added estimates do not indicate whether a teacher is good or highly effective, as is also often assumed, but rather whether a teacher is purportedly better or worse than other teachers to whom they are compared who teach entirely different sets of students who are not randomly assigned to their classrooms but for whom they are to demonstrate “similar” levels of growth; (7) then teachers assigned more “difficult-to-teach” students are held accountable for demonstrating similar growth regardless of the numbers of, for example, English Language Learners (ELLs) or special education students in their classes (although statisticians argue they can “control for this” despite recent research evidence); (8) which becomes more of a biasing issue when statisticians cannot effectively control for what happens (or what does not happen) over the summers, given that in every current state-level value-added system these tests are administered annually, from the spring of year X to the spring of year Y, always capturing the summer months in the post-test scores and biasing scores dramatically in one way or another based on that which is entirely out of the control of the school; (9) this forces the value-added statisticians to statistically assume that summer learning growth and decay matter the same for all students despite the research-based fact that different types of students lose or gain variable levels of knowledge over the summer months; and (10) goes into (11), (12), and so forth.
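The winner-and-loser problem in point (5) can be shown with a small Python sketch. The growth figures are invented, and `z_scores` is a hypothetical helper that stands in for whatever norming procedure a given system applies. The point is only that norming to the group erases any improvement shared by everyone.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Norm values to mean 0 and standard deviation 1 within the group."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical raw growth per teacher in two consecutive years.
year_1 = [10.0, 12.0, 14.0, 16.0]
year_2 = [v + 5.0 for v in year_1]  # every teacher improves by the same amount

normed_1 = z_scores(year_1)
normed_2 = z_scores(year_2)

print(normed_1)
print(normed_2)
```

Both years produce the same normed scores, and those scores always sum to zero, so under these assumptions one teacher can only move up if another moves down.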
But the numbers do not reflect any of this, now do they?
Another factor not mentioned in the excellent analysis above is that teachers are not yet evaluated on individual VAM but rather by an average of the VAM of teachers of certain subjects. For example, I teach art, which is not assessed on the FCAT, so my VAM is generated by an average of all the READING teachers, who teach a subject unrelated to my subject area. The same goes for the music, PE, and social studies teachers. As far as I know, this will continue for the new test in 2015, whatever that is. I have the blessing of working with an amazing group of colleagues who knock themselves out for our students every day. However, I do work at a Title One school, with 95% free and reduced lunch and all the challenges that come with high levels of poverty.