It’s been a while! Thanks to the passage of the Every Student Succeeds Act (ESSA; see prior posts about ESSA here, here, and here), the chaos surrounding states’ teacher evaluation systems has declined sharply. Accordingly, so have my posts. As I have written before, this is good news!
However, there seems to be a new form of test-based accountability on the rise. Some states are now being pressed to move forward with school letter grade policies, also known as A-F policies, which help states define and then label school quality in order to better hold schools and school districts accountable for their students’ test scores. These reform-based policies are being pushed by what was formerly known as the Foundation for Excellence in Education, which was launched while Jeb Bush was Florida’s governor and has since been rebranded as ExcelinEd. With Jeb Bush still in ExcelinEd’s presidential seat, the organization describes itself as a “501(c)(3) nonprofit organization focused on state education reform” that operates on approximately $12 million per year of donations from the Bill & Melinda Gates Foundation, Michael Bloomberg Philanthropies, the Walton Family Foundation, and the Pearson, McGraw-Hill, Northwest Evaluation Association, ACT, College Board, and Educational Testing Service (ETS) testing corporations, among others.
I happened to be on a technical advisory committee for the state of Arizona, advising the state board of education on its A-F policies, when I came to really understand all that was at play, including the politics. Because of this role, I decided to examine these policies with two PhD students, Tray Geiger and Kevin Winn; our findings were just put out via an American Educational Research Association (AERA) press release. Our study, titled “States’ Performance on NAEP Mathematics and Reading Exams After the Implementation of School Letter Grades,” is currently under review for publication, but below are some of the key findings, as also highlighted by AERA. These findings are especially critical for states currently using, or considering using, A-F policies to hold schools and school districts accountable for their students’ achievement, especially given that these policies clearly (as per the evidence) do not work as intended.
More specifically, 13 states currently use a school letter grade accountability system, with Florida being the first to implement a school letter grade policy in 1998. The other 12 states and their years of implementation are Alabama (2013), Arkansas (2012), Arizona (2010), Indiana (2011), Mississippi (2012), New Mexico (2012), North Carolina (2013), Ohio (2014), Oklahoma (2011), Texas (2015), Utah (2013), and West Virginia (2015). These 13 states have fared no better or worse than other states in terms of increasing student achievement on the National Assessment of Educational Progress (NAEP) – the nation’s report card, which is also widely considered the nation’s “best” test – post policy implementation. Put differently, we found mixed results as to whether there was a clear, causal relationship between implementation of an A-F accountability system and increased student achievement. There was no consistent positive or negative relationship between policy implementation and NAEP scores on grade 4 and grade 8 mathematics and reading.
- For NAEP grade 4 mathematics exams, five of the 13 states (38.5 percent) had net score increases after their A-F systems were implemented; seven states (53.8 percent) had net score decreases after A-F implementation; and one state (7.7 percent) demonstrated no change.
- Compared to the national average on grade 4 mathematics scores, eight of the 13 states (61.5 percent) demonstrated growth over time greater than that of the national average; three (23.1 percent) demonstrated less growth; and two states (15.4 percent) had comparable growth.
- For grade 8 mathematics exams, five of the 13 states (38.5 percent) had net score increases after their A-F systems were implemented, yet eight states (61.5 percent) had net score decreases after A-F implementation.
- Grade 8 mathematics growth compared to the national average varied more than that of grade 4 mathematics. Six of the 13 states (46.2 percent) demonstrated greater growth over time compared to that of the national average; six other states (46.2 percent) demonstrated less growth; and one state (7.7 percent) had comparable growth.
- For grade 4 reading exams, eight of the 13 states (61.5 percent) had net score increases after A-F implementation; three states (23.1 percent) demonstrated net score decreases; and two states (15.4 percent) showed no change.
- Grade 4 reading evidenced a pattern similar to that of grade 4 mathematics in that eight of the 13 states (61.5 percent) had greater growth over time compared to the national average, while five of the 13 states (38.5 percent) had less growth.
- For grade 8 reading, eight states (61.5 percent) had net score increases after their A-F systems were implemented; two states (15.4 percent) had net score decreases; and three states (23.1 percent) showed no change.
- In grade 8 reading, states evidenced a pattern similar to that of grade 8 mathematics in that fewer than half of the states demonstrated greater growth than the national average. Five of 13 states (38.5 percent) had greater growth over time compared to the national average, while six states (46.2 percent) had less growth, and two states (15.4 percent) exhibited comparable growth.
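For readers who want to check the arithmetic, the counts in the bullets above can be tabulated in a few lines. This is only a minimal sketch: the counts are transcribed from the bullets, while the dictionary layout and the names `net_change`, `growth_vs_nation`, and `as_percentages` are my own illustrative choices, not anything from the study itself.

```python
# Counts transcribed from the bullets above. Each tuple is
# (increase or greater growth, decrease or less growth, no change or comparable)
# out of the 13 A-F states. The names and layout are illustrative assumptions.
N_STATES = 13

net_change = {
    "grade 4 mathematics": (5, 7, 1),
    "grade 8 mathematics": (5, 8, 0),
    "grade 4 reading": (8, 3, 2),
    "grade 8 reading": (8, 2, 3),
}

growth_vs_nation = {
    "grade 4 mathematics": (8, 3, 2),
    "grade 8 mathematics": (6, 6, 1),
    "grade 4 reading": (8, 5, 0),
    "grade 8 reading": (5, 6, 2),
}


def as_percentages(counts, total=N_STATES):
    """Convert state counts to percentages, rounded to one decimal place."""
    if sum(counts) != total:
        raise ValueError(f"counts {counts} do not sum to {total}")
    return tuple(round(100 * c / total, 1) for c in counts)


def summarize(table):
    """Return {exam: (pct_up, pct_down, pct_flat)} for every exam in the table."""
    return {exam: as_percentages(counts) for exam, counts in table.items()}


if __name__ == "__main__":
    for label, table in (("Net change", net_change),
                         ("Growth vs. nation", growth_vs_nation)):
        print(label)
        for exam, pcts in summarize(table).items():
            print(f"  {exam}: {pcts}")
```

Every row sums to 13 states, and the rounded percentages match those reported in the bullets.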
In sum, relative to the national average, the NAEP data slightly favored A-F states on grade 4 mathematics and grade 4 reading; about half of the states grew more and about half grew less than the nation on grade 8 mathematics; and a plurality of states grew less than the nation on grade 8 reading. See more study details and results here.
In reality, how these states performed post-implementation is not much different from random, or a flip of the coin. As such, these results should speak directly to other states already investing, or considering investing, human and financial resources in such state-level, test-based accountability policies.
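To make the coin-flip comparison concrete, here is a rough back-of-the-envelope sketch. It is not the analysis from our study. It assumes, purely for illustration, that each of the 13 states is an independent fair coin flip between "grew more than the nation" and "did not," and it asks how surprising the most favorable result reported above (8 of 13 states) would be under that assumption. The function name `two_sided_binomial_p` is hypothetical.

```python
from math import comb


def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k 'heads' in n fair coin flips.

    Sums the probabilities of every outcome that is no more likely than
    the observed one. Illustrative assumption: states behave like
    independent fair coins, which real states certainly do not.
    """
    probs = [comb(n, i) / 2 ** n for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so outcomes tied with the observed one are included.
    return sum(p for p in probs if p <= observed + 1e-12)


if __name__ == "__main__":
    # 8 of 13 states outgrew the nation on the grade 4 exams.
    print(round(two_sided_binomial_p(8, 13), 3))
```

Under these simplifying assumptions the p-value is roughly 0.58, meaning a split at least as lopsided as 8 of 13 would turn up more often than not from fair coin flips alone.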
But the system of A-F ratings produced collateral damage well beyond the ups or downs of test scores. Ohio’s system produces statistical fictions.