The Passing of Maxine Greene – Teachers College, Columbia

It saddens me to announce, for those of you who do not already know, that the wonderful scholar and person, Maxine Greene, passed away this past Thursday (May 29, 2014) due to pneumonia. Maxine was Professor Emerita and the Founder and Director of the Center for Social Imagination, the Arts, and Education at Teachers College, Columbia University, New York. She was also past president of the Philosophy of Education Society, the American Educational Studies Association, and the American Educational Research Association, as well as a member of the National Academy of Education and the recipient of nine honorary doctoral degrees. She died at the age of 96.

To view a short video of an interview I conducted with her three years ago at her home in Manhattan, please see this three-minute YouTube clip of the interview highlights here. To view more, including the full interview, her photo gallery, reflections from her family and friends, etc., please visit her page on the Inside the Academy website here.

Her most poignant quote from these interviews, as it relates to the purposes of this blog? “I’m not the kind of teacher who wants to impose an authority on people. I suppose I’ll never stop trying to wake people up to ask questions and have passion about how they look at the world.”

She will be greatly missed by all!

Tennessee Commissioner Huffman’s Machiavellian Methods?

This post follows up on two of my most recent posts: the first about Commissioner Huffman’s (un)inspiring TEDxNashville talk, in which he vociferously celebrated Tennessee students’ recent (albeit highly questionable) gains on the National Assessment of Educational Progress (NAEP), and the second about Huffman’s (and the Tennessee Department of Education’s) unexpected postponement of the release of its state-level (TCAP) standardized test scores, scores that were, by law, to account for 15 to 25 percent of Tennessee students’ final grades. It seems a few more “behind the scenes” details surrounding what is going on in Tennessee might also explain the state’s current situation (see other “behind the scenes” details in the first aforementioned post).

We also now know that Tennessee ranks among the lowest states in the nation on the grade 12 NAEP. We also now know that, before the grade 4 NAEP tests were taken, the state retained an inordinate proportion of (low-scoring) students in grade 3, which likely caused (or at the very least helped to produce) the (purportedly) artificial gains observed in grade 4. Tennessee is not the first state to have done this, however (see Boston College Professor Walter Haney’s article about the “Texas Miracle” that also occurred on then-Governor George W. Bush’s watch here).

These “behind the scenes” explanations, unfortunately for Huffman, explain more of the gains than the casual observer might realize, although Huffman is likely counting on only casual observations being made, as it is this “unassailable evidence [emphasis added]…that [should] carry the day.” Right?!?

Now, a (perhaps reasonable) conspiracy theory has also emerged. As per this blog post, a multitude of (perhaps reasonable) reasons are offered that should “raise even more questions about the motives and reasons behind the non-delivery of [the state’s] TCAP scores to schools across the state.” Possible alternative explanations, besides the data just not being ready (given the state was trying to “narrow” and better align its tests to the forthcoming Common Core, and its “post-equating” methods took longer than anticipated), include the following, as written by the author of this same post:

  • “Narrowing” sounds like “erasing” certain questions and “post-equating” sounds like their version of “post-dating” a check. Taken together they sound very much like cooking the numbers. By eliminating certain questions after the fact, Huffman & Co. will be able to take out those questions where students consistently got wrong answers, thereby lifting the student’s overall score. The state asserts they made the decision to do this before this week, but they conveniently leave out WHO made the decision and WHEN the decision was made. They also do not tell the public WHO will be making the decisions to “narrow” (i.e., take out) certain questions. Who decides? Pearson consultants? Huffman lackeys? Who?”

As noted in the second of the aforementioned VAMboozled! posts, “the state made these changes without keeping districts informed about the changes that were being made.” In addition, they did this without the teams of folks (e.g., practitioners) who are, or who really should always be, involved in such processes, in particular to make sure that, at the very least, the content of the test items is still (and I use this term loosely) valid. So why would these folks not be involved? Perhaps because it is TRUE that Huffman et al. could indeed significantly skew the state’s TCAP test scores upwards by tossing out questions that did not represent Tennessee’s purported and widely celebrated “progress” in as positive a light. In a related op-ed piece, “the state” mentions it has a TAC [Technical Advisory Committee] on board, which is common, but there is no mention of who comprises this group or whether it was involved with the delayed release or “the state’s” reasoning behind it. This, too, is essential information (suspiciously) not provided.

Regardless, this type of Machiavellian method works quite well, and has for more than 30 years (of which I am aware), whenever state-level “leaders” desire to artificially inflate their (most often state-level) test scores.

More specifically, this most often occurs when tests are made easier “behind the scenes” so that more students pass state-level tests over time. This is most often done because states (and state leaders) cannot reasonably fail “too many” students. This happens…when tests are administered and “too many” students fail…which leads to newspaper headlines about the state’s students not being academically proficient (as uber-arbitrarily defined)…which leads to easier tests after the most difficult items are removed…which yields new headlines that students and their teachers are taking the standards more seriously, hence the observed increases…which leads to more “behind the scenes” manipulation…which again leads to higher observed test scores…which ultimately leads to a new state leader/politician calling for higher standards because “too many” students are passing the tests…which leads to higher standards and new and improved tests to hold educators accountable for meeting those higher standards…which leads to a continuous repeat of the cycle.
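
To see the shape this cycle produces, consider a toy simulation (all numbers below are invented purely for illustration; I am not modeling any actual state’s data):

    # Toy illustration of how passing rates can drift upward while a test
    # form stays in place, then plunge when a "new, harder" test resets the
    # scale. All numbers are invented for illustration only.
    def simulate_scores(years=15, reset_every=5):
        rates, passing = [], 55.0          # hypothetical starting passing rate (%)
        for year in range(1, years + 1):
            if year % reset_every == 0:
                passing -= 20.0            # "higher standards" test: scores plunge
            else:
                passing += 4.0             # easing/familiarity: scores climb
            rates.append(round(min(passing, 100.0), 1))
        return rates

    print(simulate_scores())   # climbs for four years, drops, climbs again: saw teeth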

This is what we in the testing/assessment field call “the saw-tooth effect,” or phenomenon (see more about this, as per University of Colorado Boulder Professor Emeritus Robert Linn, here). This (typically) only occurs with state-level tests that are highly manipulable, both “behind the scenes” and in practice (e.g., via “teaching to the test”).

…back to the prior post:

  • “So now that we have established a Method, how about a Motive? Here is where it gets really interesting and raises possibilities of scandal, cover-your-ass politics and even possible criminal activity. In politics timing is everything. So the question must be asked, WHAT was the reason the state deliberately refused to inform schools about their “new” methods? WHAT was occurring at precisely the same time the state [Department of Education] was making their last-second announcement about not delivering TCAP scores? If the unaltered, un-narrowed, un-post-equated TCAP scores were really bad, and were released as required by law…what would have been the result?” Why such concern? Here is one possible answer: Arne Duncan was coming to town. “The [Tennessee] governor and Huffman were hosting 400 education writers and [Duncan]…the Obama Secretary of Education in Nashville…the exact time TCAP scores were supposed to be delivered to the schools. Can you imagine how embarrassing it would have been if 400 education writers from around the country were exposed to horrible test scores from Tennessee’s much-claimed Common Core “success”? The governor and Huffman would look like fools AND liars. And the public may have caught onto their scam. And THAT, ladies & gentlemen of the jury, is what you would need to convict someone of mis-feasance, mal-feasance, stupid-feasance and bend-over-here-it-comes-feasance.”

While I am not a conspiracy theorist, and I like to believe people and politicians are at least born of good nature, I also know that we have seen cases like this all too often in the past, WHEN test scores matter and ESPECIALLY WHEN politicians have their political lives and futures on the “results trump all” line. It is too often the case that they “must” engage in Machiavellian behaviors to save, if not extend, their political careers.

On that note, and for what it’s worth, another author has come to a similar conclusion in the aforementioned op-ed piece, writing: “I think this year’s TCAP had multiple mistakes on it, and as state officials gasp in growing horror at low TCAP scores, they’re backpedaling, waiver-giving and post-equating their way [out of] a bureaucratic nightmare.” The author of this piece went so far as to ask for Huffman’s resignation.

That being said, I will leave you all to be the judges here, as I have offered everything I have on this (crazy, but in many ways unfortunately familiar) situation in Tennessee. Those of you in Tennessee, please add to the discussion as well.

The Study that Keeps on Giving…(Hopefully) in its Final Round

In January I wrote a post about “The Study that Keeps on Giving…” Specifically, this post was about the study conducted and authored by Raj Chetty (Economics Professor at Harvard), John Friedman (Assistant Professor of Public Policy at Harvard), and Jonah Rockoff (Associate Professor of Finance and Economics at Columbia) that was published first in 2011 (in its non-peer-reviewed and not even internally reviewed form) by the National Bureau of Economic Research (NBER) and then published again by NBER in January of 2014 (in the same form), but this time split into two separate studies (see them split here and here).

Their re-release of the same, albeit split, study was what prompted the title of the initial “The Study that Keeps on Giving…” post. Little did I know then, though, that the reason this study was re-released in split form was that it was soon to be published in a peer-reviewed journal. Its non-peer-reviewed publication status had been a major source of prior criticism. While the journal’s editors seem to have suggested the split, NBER seemingly took advantage of the opportunity to publicize the study in two forms, regardless and without prior explanation.

Anyhow, this came to my attention when the study’s lead author, Raj Chetty, emailed me a few weeks ago, copying Diane Ravitch, and also apparently emailed other study “critics” at the same time (see prior reviews of this study as per its other notable “critics” here, here, here, and here) to notify all of us that the study had made it through peer review and was to be published in a forthcoming issue of the American Economic Review. While Diane and I responded to our joint email (as other critics may have done as well), we ultimately promised Chetty that we would not share the actual contents of any of the approximately 20 email exchanges that went back and forth among the three of us over the following days.

What I can say, though, is that no genuine concern was expressed by Chetty, or on behalf of his co-authors, about the intended or unintended consequences that have come about as a result of the study, nor about how many policymakers have since used and abused its results for political gain and the further advancement of VAM-based policies. Instead, the emails were more or less self-promotional and celebratory, especially given that President Obama cited the study in his 2012 State of the Union Address and that Chetty apparently continues to advise U.S. Secretary of Education Arne Duncan about similar VAM-based policies. Perhaps, next, the Nobel prize committee might pay this study its due respects, now overdue; but again, I only paraphrase what I inferred from these email conversations.

As a refresher, Chetty et al. conducted value-added analyses on a massive data set (with over one million student-level test and tax records) and presented (highly questionable) evidence favoring teachers’ long-lasting, enduring, and in some cases miraculous effects. While some of the findings would have been very welcome to the profession had they indeed been true (e.g., that high value-added teachers substantively affect students’ incomes in their adult years), the study’s authors overstated their findings, and they did not duly consider (or provide evidence to counter) alternative hypotheses in terms of what other factors besides teachers might have caused the outcomes they observed (e.g., those things that happen outside of schools while students are in school and throughout students’ lives).

Nor did they consider, or rather satisfactorily consider, how the non-random assignment of students into both schools and classrooms might have biased the effects observed, in that the students in high “value-added” teachers’ classrooms might have been more “likely to succeed” regardless of, or even despite, the teacher effect, on both the short- and long-term outcomes demonstrated in their findings, which were then widely publicized via the media and throughout other political spheres.

Rather, Chetty et al. advanced what they argued were a series of causal effects by exploiting a series of correlations that they turned attributional. They did this because (I believe) they truly believe that their sophisticated econometric models, controls, and approaches in fact work as intended. Perhaps this also explains why Chetty et al. give pretty much all credit in the area of value-added research to econometricians, and do so throughout their papers, all the while over-citing the works of their economist friends but not the others (besides Berkeley economist Jesse Rothstein; see the full reference to his study here) who have outright contradicted their findings, with evidence. Apparently, educational researchers do not have much to add on this topic, but I digress.

But this, too, is a serious fault, as “they” (and I don’t mean to make sweeping generalizations here) have never been much for understanding what goes into the data they analyze, data that are socially constructed and largely context dependent. Nor do they seem to care to fully understand the realities of the classrooms from which they receive such data, or what test scores actually mean, or, when using them, what one can and cannot actually infer. This, too, was made clear via our email exchange. It seems this from-the-sky-down view of educational data is the most convenient approach, one “they” might even expressly prefer, so that they do not have to get their data fingers dirty and deal with the messiness that always surrounds these types of educational data and always comes into play when conducting most any type of educational research that relies (in this case solely) on students’ large-scale standardized test scores.

Regardless, I decided to give this study yet another review to see if, now that it has made it through the peer review process, I was missing something. I wasn’t. The studies are pretty much exactly the same as they were when first released (which unfortunately does not say much for peer review). The first study, here, is about VAM-based bias and how VAM estimates that control for students’ prior test scores “exhibit little bias despite the grouping of students,” despite the number of studies, not referenced or cited, that continue to evidence the opposite. The second study, here, is about teacher-level value-added and how teachers with a lot of it (purportedly) cause grander things throughout their students’ lives. More specifically, they found that “students [non-randomly] assigned to high [value-added] teachers are more likely to attend college, earn higher salaries, and are less likely to have children as teenagers.” They also found that “[r]eplacing a teacher whose [value-added] is in the bottom 5% with an average teacher would increase the present value of students’ lifetime income by approximately $250,000 per classroom [emphasis added].” Please note that this overstated figure is not per student; had it been broken out by student, it would rather have become “chump change,” for lack of a better term, which serves as just one example of their classic exaggerations. They do, however, when you read through the actual text, tone their powerful language down a bit to note that, on average, this figure is more accurately $185,000, still per classroom. Again, to read the more thorough critiques conducted by scholars with similarly impressive academic profiles, I suggest readers click here, here, here, or here.
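
For those who want the “chump change” arithmetic spelled out, here is a back-of-the-envelope version (the class size and working-years figures are my assumptions, chosen only to be roughly plausible):

    # Back-of-the-envelope arithmetic on the headline "$250,000 per classroom"
    # figure. Class size and career length are assumed values for illustration.
    gain_per_classroom = 250_000            # headline lifetime-earnings gain ($)
    class_size = 28                         # assumed number of students
    working_years = 40                      # assumed length of a working life

    per_student_lifetime = gain_per_classroom / class_size
    per_student_per_year = per_student_lifetime / working_years

    print(round(per_student_lifetime))      # ~8929 dollars over a lifetime
    print(round(per_student_per_year))      # ~223 dollars per year

A couple hundred dollars per student per year is a rather different headline than a quarter of a million dollars.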

What I did find important to bring to light during this round of review were the assumptions that, thanks to Chetty and his emails, were made more obvious (and likewise troublesome) than before. These are the, in some cases, “very strong” assumptions that Chetty et al. make explicit in both of their studies (see Assumptions 1-3 in the first and second papers), along with “evidence” for why the assumptions should not be rejected (most likely, and in some cases clearly, because their studies relied on them). The assumptions they made were so strong, in fact, that at one point they even mention it would have been useful had they been able to “relax” some of them. In other cases, they justify their adoption of these assumptions given the data limitations and methodological issues they faced, plainly and simply because there was no other way to conduct (or continue) their analyses without making and agreeing to these assumptions.

So, see for yourselves whether you agree with the following three assumptions they make most explicit and use throughout both studies (although other assumptions are littered throughout both pieces). I would love for Chetty et al. to discuss whether these assumptions in fact hold given the realities the everyday teacher, school, or classroom faces. But again, I digress…

Assumption 1 [Stationarity]: Teacher levels of value-added, as based on growth in student achievement over time, follow a stationary, unchanging, constant, and consistent process. On average, “teacher quality does not vary across calendar years and [rather] depends only on the amount of time that elapses between” years. While completely nonsensical to the average adult with really any common sense, this assumption helped them “simplif[y] the estimation of teacher [value-added] by reducing the number of parameters” needed in their models, or, more appropriately, needed to model their effects.
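
In symbols, and paraphrasing into generic notation of my own rather than quoting the authors’ exact formulation, stationarity amounts to something like:

    E[\mu_{jt}] = \mu \qquad \text{and} \qquad Cov(\mu_{jt}, \mu_{j,t+s}) = \sigma_\mu(s) \quad \text{for all } t

where \mu_{jt} is teacher j’s “quality” in year t. That is, a teacher’s expected quality never drifts, and the relationship between any two years of a teacher’s career depends only on how far apart those years are, never on which calendar years they happen to be.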

Assumption 2 [Selection on Excluded Observables]: Students are sorted or assigned to teachers on excluded observables that can be estimated. See a recent study that I conducted with a doctoral student of mine (just published in this month’s issue of the highly esteemed American Educational Research Journal here), in which we found, with evidence, that 98% of the time this assumption is false. Students are non-randomly sorted on “observables” and “non-observables” (most of which are not and cannot be included in such data sets) 98% of the time, and both types of variables bias teacher-level value-added over time, given that the statistical procedures meant to control for these variables do not work effectively, especially for students at the extremes on both sides of the normal bell curve. While convenient, especially when conducting this type of far-removed research, this assumption is false and cannot really be taken seriously given the pragmatic realities of schools.

Assumption 3 [Teacher Switching as a Quasi-Experiment]: Changes in teacher-level value-added scores across cohorts within a school-grade are orthogonal to (i.e., non-overlapping, uncorrelated, or independent of) changes in other determinants of student scores. While Chetty et al. themselves write that this assumption “could potentially be violated by endogenous student or teacher sorting to schools over time,” they also state that “[s]tudent sorting at an annual frequency is minimal because of the costs of changing schools,” which is yet another unchecked assumption offered without reference(s) in support. They further note that “[w]hile endogenous teacher sorting is plausible over long horizons, the high-frequency changes [they] analyze are likely driven by idiosyncratic shocks such as changes in staffing needs, maternity leaves, or the relocation of spouses.” These are all plausible assumptions too, right? Is “high-frequency teacher turnover…uncorrelated with student and school characteristics?” Concerns about this, and really all of these assumptions, and ultimately how they impact study findings, should certainly cause pause.
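
Schematically, and again in generic notation of my own rather than the authors’, the teacher-switching quasi-experiment requires:

    Cov(\Delta \bar{\mu}_{sgt}, \Delta \bar{\varepsilon}_{sgt}) = 0

where \Delta \bar{\mu}_{sgt} is the change in mean teacher value-added across adjacent cohorts in school s, grade g, and year t, and \Delta \bar{\varepsilon}_{sgt} is the corresponding change in all other determinants of those cohorts’ scores. Everything they list above (staffing shocks, maternity leaves, relocating spouses) is, in effect, an argument that this covariance just happens to equal zero.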

My final point, interestingly enough, also came up during the email exchanges mentioned above. Chetty made the argument that he, more or less, had no dog in the fight surrounding value-added. In the first sentence of the first manuscript, however, he (and his colleagues) wrote, “Are teachers’ impacts on students’ test scores (“value-added”) a good measure of their quality?” The answer, soon thereafter and repeatedly made known in both papers, becomes an unequivocal “Yes.” Chetty et al. write in the first paper that “[they] established that value-added measures can help us [emphasis added, as “us” is undefined] identify which teachers have the greatest ability to raise students’ test scores.” In the second paper, they write that “We find that teacher [value-added] has substantial impacts on a broad range of outcomes.” Apparently, Chetty wasn’t representing his and/or his colleagues’ honest “research-based” opinions and feelings about VAMs very well in one place (i.e., our emails) or the other (his publications).

Contradictions…as nettlesome as those dirty little assumptions, I suppose.

UCLA Professor Emeritus W. James Popham: His Testing and Teacher Evaluation Infomercial

W. James Popham, Emeritus Professor in the Graduate School of Education at the University of California, Los Angeles (UCLA), is best known for his decades of research on testing and assessment. Many of you might be familiar with his work if you have ever read his classic textbook on assessment: Classroom Assessment: What Teachers Need to Know.

He is also known for having quite a sense of humor. Check out a new infomercial he produced about tests; tests as they relate to teacher evaluation (e.g., of the growth and value-added kind); and teacher observational systems as they, too, relate to teacher evaluation systems currently based on “multiple measures.”

The whole infomercial is just over 10 minutes, but do give it your full attention. There were parts at which I literally laughed out loud, although at the same time I could not get over the irony. Enjoy!

Tennessee Commissioner Huffman’s Accountability Immunity

Following up on my most recent post, about the video capturing Tennessee’s Education Commissioner Kevin Huffman’s “Inspiring” TEDxNashville Talk, a VAMboozled! follower sent me a follow-up article linked here, “about the nightmare schools here in Tennessee have had with big testing/data [over] the past two days.”

Here’s the summary of the situation, as captured in a different article written in The Tennessean, linked here and titled “Tennessee to Let Schools [emphasis added] Out of TCAP [Tennessee Comprehensive Assessment Program] Requirement Due to Score Delay.”

It seems that “A move by [Huffman and his Tennessee] Department of Education to make exams better aligned to Common Core standards has delayed the release of [its TCAP] test scores…[T]he unexpected 10-day postponement will mean a four-year-old law designed to give more meaning to [the state’s TCAP] standardized tests won’t be applied to many students,” as these test scores are to “account for 15 to 25 percent of Tennessee students’ final grades.”
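
As a quick illustration of what that weight means for an individual student (the grades below are hypothetical, and I assume a 20% weight here, within the law’s 15 to 25 percent range):

    # Hypothetical illustration of TCAP's legally required weight in a final
    # course grade. The 15-25% range is set by the law; 20% is assumed here.
    tcap_weight = 0.20
    coursework_grade, tcap_score = 90, 60   # invented grades for one student

    final_grade = (1 - tcap_weight) * coursework_grade + tcap_weight * tcap_score
    print(final_grade)                      # 84.0 -- six points off this student's average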

“State officials first alerted Tennessee’s school directors of the delay on Tuesday afternoon, explaining that the state had narrowed assessments this year to eliminate factors not aligned with state standards and needed 10 more days to thoroughly review the results.” That is, state officials narrowed the test to (simply) remove items/portions that didn’t align with the Common Core. As they are now figuring out, this was not as simple as it first may have seemed. Likewise, it should have been done with much more care and with many more people at the table (e.g., teachers, for content validity). It also seems that the state made these changes without keeping districts informed about the changes that were being made.

Removing items/portions from tests invalidates pretty much everything about them (noting that what they were valid for, even in their most perfect forms, is typically suspect). In addition, doing this requires complex, additional analyses to put the tests back into their “best” and “most valid” forms. It also likely means that, because of their simplistic approach, state officials are now dealing, behind the scenes, with some serious chaos in the test scores that came about as a result.
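
For context, even the simplest textbook method for putting two test forms on a common scale, linear equating, involves a transformation like the following (a generic formula, not necessarily the “post-equating” method Tennessee used):

    y(x) = \mu_Y + \frac{\sigma_Y}{\sigma_X} (x - \mu_X)

Here x is a score on the altered form, and the means and standard deviations come from the two forms being linked. Doing even this defensibly requires stable item statistics and comparable groups of examinees, which is exactly what yanking items out after the test has been administered puts at risk.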

Education Commissioner Huffman’s role in all of this? The Tennessee House Democratic Leader replied, “While Commissioner Huffman has pushed for more and more accountability for our teachers, his own department has yet to be called to account for their own failures.” A local superintendent replied, “How do we consider the actions of [Huffman and the Tennessee Department of Education] consistent with any reasonable and prudent management decision that we all expect and require at all levels of effective governance?” It doesn’t seem like Huffman follows his own mantra, that his own “results [should] trump all” (see this/his mantra discussed in more depth in the aforementioned post).

Huffman’s response, not to these particular comments but to the situation overall? “If [districts] need to ask for a waiver, they can…Our goal is to get accuracy [emphasis added] over speed.” An apparent tweak to the mantra, I suppose.

I don’t believe, from Huffman’s or the Tennessee Department of Education’s perspectives, that this will impact teacher evaluations, but it should. Messing with tests in such a way, without putting high-stakes accountability on hold until the tests are once again re-calibrated and validated, is certainly cause for pause, if not grounds for additional lawsuits to be added to the three already in play.

Tennessee’s Education Commissioner Gives “Inspiring” TEDxNashville Talk

The purpose of TED talks is, and since their inception in 1984 has been, to share “ideas worth spreading” in the most innovative and engaging of ways. It seems that somebody in the state of Tennessee hijacked this idea, however, and put the Tennessee Education Commissioner (Kevin Huffman) on the TEDxNashville stage to talk about teachers, teacher accountability, and why teachers need to “work harder” and be held more accountable for meeting higher standards if they don’t.

Watch the video here:

And/or read the accompanying article here. But I have summarized the key points, as I see them, for you all below, in case you’d rather not view/read for yourself.

Before we begin, though, I should make explicit that Commissioner Huffman was formerly an executive with Teach for America (TFA), and it was from his “grungy, non-profit cubicle in DC” that the Governor of Tennessee picked him and ultimately placed him into the state Commissioner position. So, as explained by him, he “wasn’t a total novice” in terms of America’s public schools because 1) he was in charge of policy and politics at TFA, 2) he taught in Houston for three years through TFA, and 3) he brought with him to Tennessee prior experience “dealing” with Capitol Hill. All of this (and a law degree) made him qualified to serve as Tennessee’s Education Commissioner.

This background knowledge might help others understand where his (misinformed and quite simply uninspiring) sense of “reality” about teachers in America’s, and in particular Tennessee’s, public schools comes from. This should also help to explain some of the comments on which he simply “struck out,” as demonstrated via this video. Oh, and he was previously married to Michelle Rhee, head of StudentsFirst, former Chancellor of Washington D.C.’s public schools, and the source of other VAMboozled! posts here, here, and here. So his ideas/comments, or “strikes,” as I’ve called them below, might actually make sense in this context.

Strike 1 – Huffman formerly wrote an education column for The Washington Post, during which time he endured readers’ “anonymous…boo and hiss” comments. This did not change his thinking, however, but rather helped him develop the “thick skin” he needed in preparation for his job as Commissioner. This was yet another “critical prerequisite” for his future leadership role: not to ingest reader feedback and criticism for what it was worth, but rather to reject it and use it as justification to stay the course…and ultimately to celebrate this self-professed obsession, as demonstrated in what follows.

Strike 2 – He has a self-described “nerdy obsession” with data, and results, and focusing on data that lead to results. While he came into Tennessee as a “change agent,” he noticed that a lot of the changes he desired were already in place…and had already been appropriately validated by the state of Tennessee “winning” the Race to the Top competition and receiving the $500 million that came along with “winning” it. What he didn’t note, however, was that while his professed and most important belief was that “results trump all,” the state of Tennessee had been carrying this mantra since the early 1990s, when it adopted the Tennessee Value-Added Assessment System (TVAAS) for increased accountability. This was probably well before he even graduated from college.

Regardless, he noted that nothing had been working in Tennessee for all of this time, until he came into his leadership position and forced this policy’s fit. “If you can deliver unassailable evidence [emphasis added] that students are learning more and students are benefiting, then those results [should] carry the day.” He did not once mention that the state’s use of the TVAAS was THE reason the state of Tennessee was among the first to win Race to the Top funds…and that even though the state had been following the same “results trump all” mantra since the early 1990s, Tennessee was still ranked 44th in the nation in terms of its national levels of achievement (i.e., on the National Assessment of Educational Progress [NAEP]), and in some areas ranked lower, twenty years in, when he took office. Regardless of the state’s history, however, Commissioner Huffman went all in on “raising academic standards” and focusing on the state’s “evaluation” system. While “there was a bunch of work” previously done in Tennessee on this note, he took these two tenets “more seriously” and “really drove [them] right in.”

Strike 3 – When Huffman visited all 136 school districts in the state after he arrived, and thereafter took these things “more seriously,” there were soon “all of these really good signs” that things were working. The feedback he got in response to his initiatives was all [emphasis added] positive. He kept hearing things like “kids are [now] learning more [and] instruction is getting better,” thanks to him, and he would hear the same feedback regardless of whether people liked him and/or his “results trump all” policies. More shocking here, though, was this self-promotion, given that there are currently three major lawsuits in the state, all surrounding the state’s teacher evaluation system as carried forth by him. This is the state, in fact, “winning” the “Race to the Lawsuit” competition, should one currently exist.

In terms of hard, test-based evidence in support of his effectiveness, though, while the state tests were demonstrating the effectiveness of his policies, Huffman was really waiting for the aforementioned NAEP results to see if what he was doing was working. And here’s what happened: two years after he arrived, “Tennessee’s kids had the most growth of any kids in America.”

But here’s what REALLY happened.

While Sanders (the TVAAS developer who first convinced the state legislature to adopt his model for high-stakes accountability purposes in the 1990s) and others (including U.S. Secretary of Education Arne Duncan) also claimed that Tennessee’s use of accountability instruments caused Tennessee’s NAEP gains (never mind that the purported gains arrived over two decades later), others have since spoiled the celebration because 1) the results also demonstrated an expanding achievement gap in Tennessee; 2) the state’s lowest socioeconomic students continue to perform poorly, despite Huffman’s claims; 3) Tennessee didn’t make gains significantly different from those of many other states; and 4) other states with similar accountability instruments and policies (e.g., Colorado, Louisiana) did not make similar gains, while states without such instruments and policies (e.g., Kentucky, Iowa, Washington) did. I should add that Kentucky’s achievement gap is also narrowing and its lowest socioeconomic students have made significant gains. This is important to note, as Huffman repeatedly compares his state to Kentucky.

Claiming that NAEP scores increased because of TVAAS use, and because of other stringent accountability policies developed by Huffman, was and continues to be unwarranted (see also this article in Education Week).

So my apologies to Tennessee, because it appears that TEDx was a TED strike-out this time. Tennessee’s Commissioner struck out, and the moot pieces of evidence supporting these three strikes will ultimately unveil their true selves in the very near future. For now, though, there is really nothing to celebrate, even if Commissioner Huffman brings you cake to convince you otherwise (as referenced in the video).

In the end, this, “the Tennessee story” (as Huffman calls it), reminds me of a story from way back in 2002, when then-President (and former Texas Governor) George W. Bush spoke about “the Miracle in Texas” in an all too familiar light…he, too, ultimately “struck out.” Bush talked then of all of the indicators, similar to those here, that were to be celebrated thanks to his (which was really Ross Perot’s) similar high-stakes policy initiative. While the gains in Texas were, at the same time, evidenced to have been artificially inflated, this never hit the headlines. Rather, these artificial results helped President George W. Bush advance No Child Left Behind (NCLB) at the national level.

As we are all now aware, or should be, NCLB, now more than 10 years on, never demonstrated its intended consequences; rather, it demonstrated only negative, unintended consequences instead. In addition, the state of Texas, in which such a similar miracle supposedly occurred, and where NCLB was first conceived, is now performing about average compared to the nation…and losing ground.

Call me a cynic, call me a realist…but there it is.

EVAAS’s SAS Inc.: “The Frackers of the Educational World”

David Patten, a former history teacher, college instructor, and author, recently wrote an excellent article for the History News Network about VAMs in his state of Ohio (another state that uses the Education Value-Added Assessment System [EVAAS] statewide). He writes about what the state of Ohio is getting in terms of bang for its buck, at a rate of $2.3 million per year. Just to be clear, this covers only the cost of calculating the state’s value-added estimates, as based on the state’s standardized tests; it does not include what the state also pays yearly for the standardized tests themselves.

You can read the full article here, but here are some of the key highlights as they directly pertain to VAMs in Ohio.

Patten explains that Ohio uses a model that combines teachers’ EVAAS scores (themselves reported on a five-level scale) with scores derived via their administrators’ observations into what is a 50/50 teacher evaluation model, which ultimately results in a four-category teacher ranking system including the following teacher quality categories: 1. Ineffective, 2. Developing, 3. Skilled, and 4. Accomplished. While Ohio is currently using its state tests, it is soon to integrate, and likely replace these with, the Common Core tests and/or tests purchased from “approved vendors.”
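
To make the arithmetic of such a model concrete, here is a minimal sketch of how a 50/50 composite rating could be computed (the scales and cut points below are my hypothetical placeholders, not Ohio’s actual ones):

    # Hypothetical sketch of a 50/50 composite teacher rating like Ohio's.
    # The 0-100 scales and the cut points are invented placeholders.
    LABELS = ["Ineffective", "Developing", "Skilled", "Accomplished"]

    def rate_teacher(evaas_score, observation_score):
        """Average a growth (EVAAS) score and an observation score, both
        assumed here to be on 0-100 scales, then bucket the composite."""
        composite = 0.5 * evaas_score + 0.5 * observation_score
        cuts = [40, 60, 80]                  # hypothetical category boundaries
        return LABELS[sum(composite >= c for c in cuts)]

    print(rate_teacher(evaas_score=35, observation_score=90))   # "Skilled"

Note what the 50/50 averaging implies: half of a teacher’s final category can swing on the EVAAS side alone, however opaque its calculation.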

As for the specifics of the model, however, he writes that the EVAAS system (as others have written about extensively) is pretty “mysterious” beyond that, beyond the more or less obvious:

What exactly is the mathematical formula that will determine the fate of our teachers and our educational systems? Strangely enough, only the creators know the mysterious mix; and they refuse to reveal it.

The dominant corporation in the field of value added is SAS, a North Carolina company. Their Value Added Assessment and Research Manager is Dr. William Sanders [the topic of a previous post here] who is also the primary designer of their model. While working at the University of Tennessee, his remarkable research into agricultural genetics and animal breeding inspired the very model now in use for teacher evaluation. The resultant SAS [EVAAS] formula boasts a proprietary blend of numbers and probabilities. Since it is a closely guarded intellectual property, it becomes the classic enigma wrapped up in taxpayer dollars. As a result, we are urged to take its validity and usefulness as an article of faith. SAS and their ilk have, in fact, become the frackers of the educational world [emphasis added]. They propose to drill into our educational foundations, inject their algorithmic chemicals into our students and instructors, and just like the frackers of the oil and gas world, demand that we trust them to magically get it right.

Strangely enough, Ohio is not worried about this situation. Indeed, no one at the Ohio Department of Education has embraced even the pretense of understanding the value added model it adopted. Quite to the contrary, they admitted to never having seen the complete model, let alone analyzing it. They have told us that it does not matter, for they do not need to understand it. In their own words, they have chosen to “rely upon the expertise of people who have been involved in the field.” Those are remarkable words and admissions and they are completely consistent with an educational bureaucracy sporting the backbone of an éclair.

In terms of dollars and cents, trust comes at a very high price. Ohio will pay SAS, Inc. an annual fee of 2.3 million dollars to calculate value added scores. I found very similar fees in the other states making use of their proprietary expertise.

Should we be afraid of this mystical undertaking? Of course not, instead, we should be terrified. Not only are we stumbling into the dark, unseen room and facing all the horror that implies, but the research into the effectiveness of the model shows it to be as educationally decrepit as the high stakes testing upon which it is based…

…Mark Twain supposedly said, “Sometimes I wonder whether the world is being run by smart people who are just putting us on or by imbeciles who really mean it.” Whether the value added advocates are smart people or imbeciles is unknown to me. What is known to me is that value added has no value. Through it and through standardized testing we have become the architects of an educational system of breathtaking mediocrity. One more thing is abundantly clear; no student and no teacher should ever accept a ride from the “Value Added Valkyries.”

To read more, also about the research Patten highlights to substantiate his claims, again, click here.

VAMs and the “Dummies in Charge”: A Clever “Must Read”

Peter Greene wrote a very clever, poignant, and to-the-point article about VAMs, titled “VAM for Dummies,” in The Blog of The Huffington Post. While I have already tweeted, facebooked, shared, and done everything else short of printing this one out for my files, I thought it imperative I also share it with you. Also, Greene gave us at VAMboozled! a wonderful shout-out, directing readers here to find out more. So Peter, even though I’ve never met you, thanks for the kudos, and keep it coming. This is a fabulous piece!

Click here to read his piece in full. I’ve also pasted it below, mainly because this one is a keeper. See also a link to a “nifty” 3-minute video on VAMs below.

If you don’t spend every day with your head stuck in the reform toilet, receiving the never-ending education swirly that is school reformy stuff, there are terms that may not be entirely clear to you. One is VAM — Value-Added Measure.

VAM is a concept borrowed from manufacturing. If I take one dollar’s worth of sheet metal and turn it into a lovely planter that I can sell for ten dollars, I’ve added nine dollars of value to the metal.

It’s a useful concept in manufacturing management. For instance, if my accounting tells me that it costs me ten dollars in labor to add five dollars of value to an object, I should plan my going-out-of-business sale today.

And a few years back, when we were all staring down the NCLB law requiring that 100 percent of our students be above average by this year, it struck many people as a good idea — let’s check instead to see if teachers are making students better. Let’s measure if teachers have added value to the individual student.

There are so many things wrong with this conceptually, starting with the idea that a student is like a piece of manufacturing material and continuing on through the reaffirmation of the school-is-a-factory model of education. But there are other problems as well.

1) Back in the manufacturing model, I knew how much value my piece of metal had before I started working my magic on it. We have no such information for students.

2) The piece of sheet metal, if it just sits there, will still be a piece of sheet metal. If anything, it will get rusty and less valuable. But a child, left to its own devices, will still get older, bigger, and smarter. A child will add value on its own, out of thin air. Almost like it was some living, breathing sentient being and not a piece of raw manufacturing material.

3) All pieces of sheet metal are created equal. Any that are too not-equal get thrown in the hopper. On the assembly line, each piece of metal is as easy to add value to as the last. But here we have one more reformy idea predicated on the idea that children are pretty much identical.

How to solve these three big problems? Call the statisticians!

This is the point at which that horrifying formula that pops up in these discussions appears. Or actually, a version of it, because each state has its own special sauce when it comes to VAM. In Pennsylvania, our special VAM sauce is called PVAAS [i.e., the EVAAS in Pennsylvania]. I went to a state training session about PVAAS in 2009 and wrote about it for my regular newspaper gig. Here’s what I said about how the formula works at the time:

PVAAS uses a thousand points of data to project the test results for students. This is a highly complex model that three well-paid consultants could not clearly explain to seven college-educated adults, but there were lots of bars and graphs, so you know it’s really good. I searched for a comparison and first tried “sophisticated guess;” the consultant quickly corrected me–“sophisticated prediction.” I tried again–was it like a weather report, developed by comparing thousands of instances of similar conditions to predict the probability of what will happen next? Yes, I was told. That was exactly right. This makes me feel much better about PVAAS, because weather reports are the height of perfect prediction.

Here’s how it’s supposed to work. The magic formula will factor in everything from your socio-economics through the trends over the past X years in your classroom, throw in your pre-testy thing if you like, and will spit out a prediction of how Johnny would have done on the test in some neutral universe where nothing special happened to Johnny. Your job as a teacher is to get your real Johnny to do better on The Test than Alternate Universe Johnny would.

See? All that’s required for VAM to work is believing that the state can accurately predict exactly how well your students would have done this year if you were an average teacher. How could anything possibly go wrong??

And it should be noted — all of these issues occur in the process before we add refinements such as giving VAM scores based on students that the teacher doesn’t even teach. There is no parallel for this in the original industrial VAM model, because nobody anywhere could imagine that it’s not insanely ridiculous.

If you want to know more, the interwebs are full of material debunking this model, because nobody — I mean nobody — believes in it except politicians and corporate privateers. So you can look at anything from this nifty three minute video to the awesome blog Vamboozled by Audrey Amrein-Beardsley.

This is one more example of a feature of reformy stuff that is so top-to-bottom stupid that it’s hard to understand. But whether you skim the surface, look at the philosophical basis, or dive into the math, VAM does not hold up. You may be among the people who feel like you don’t quite get it, but let me reassure you — when I titled this “VAM for Dummies,” I wasn’t talking about you. VAM is always and only for dummies; it’s just that right now, those dummies are in charge.
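
To make Greene’s “Alternate Universe Johnny” concrete, here is the actual-minus-predicted logic that sits at the core of most VAMs, stripped of all the statistical machinery (a deliberately naive sketch of mine; the predicted scores below are invented, and real systems like PVAAS/EVAAS wrap this core idea in far more elaborate modeling):

    # Deliberately naive sketch of the "beat your predicted score" logic
    # Greene describes. All numbers are invented for illustration.
    from statistics import mean

    def value_added(students):
        """Each student is an (actual, predicted) pair, where the prediction
        is the model's guess at what an 'average teacher' would have produced.
        A positive mean difference is read as 'added value'."""
        return mean(actual - predicted for actual, predicted in students)

    johnnys_class = [(78, 74), (65, 70), (88, 81)]   # invented (actual, predicted)
    print(value_added(johnnys_class))                # 2 -> "above expectation"

Everything contested about VAMs lives inside those predicted scores; the subtraction itself is the easy part.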

Florida’s VAM-Based Evaluation System Ruled “Unfair but Not Unconstitutional”

From Diane Ravitch’s blog comes an important update regarding the lawfulness of VAM-based systems that I want to be sure readers of this blog don’t miss.

She writes: “A federal judge in Florida dismissed a lawsuit against the state evaluation system, declaring that it was unfair to rate teachers based on the scores of students they never taught but not unconstitutional.

The evaluation system may be stupid; it may be irrational; it may be unfair; but it does not violate the Constitution. So says the judge.

An article in the Florida Education Association newsletter described the ruling:

“The federal lawsuit, known as Cook v. Stewart, was filed last year by the FEA, the National Education Association and seven accomplished teachers and the local education associations in Alachua, Escambia and Hernando counties. The lawsuit challenged the evaluation of teachers based on the standardized test scores of students they do not teach or from subjects they do not teach. They brought suit against the Florida commissioner of education, the State Board of Education and the school boards of those three counties, who have implemented the evaluation system to comply with 2011’s Senate Bill 736.

“On Tuesday afternoon, U.S. District Judge Mark Walker dismissed FEA’s challenges to the portions of SB 736 that call for teachers to be evaluated based upon students and/or subjects the teachers do not teach, though he expressed reservations on the practice.

We are disheartened by the judge’s ruling. Judge Walker acknowledged the many problems with this evaluation system, but he ruled that they did not meet the standard to be declared unconstitutional. We are evaluating what further steps we might take in this legal process.

Judge Walker indicated his discomfort with the evaluation process in his order.

“The unfairness of the evaluation system as implemented is not lost on this Court,” he wrote. “We have a teacher evaluation system in Florida that is supposed to measure the individual effectiveness of each teacher. But as the Plaintiffs have shown, the standards for evaluation differ significantly. FCAT teachers are being evaluated using an FCAT VAM that provides an individual measurement of a teacher’s contribution to student improvement in the subjects they teach.” He noted that the FCAT VAM has been applied to teachers whose students are tested in a subject that teacher does not teach and to teachers who are measured on students they have never taught, writing that “the FCAT VAM has been applied as a school-wide composite score that is the same for every teacher in the school. It does not contain any measure of student learning growth of the … teacher’s own students.”

In his ruling, Judge Walker indicated there were other problems.

“To make matters worse, the Legislature has mandated that teacher ratings be used to make important employment decisions such as pay, promotion, assignment, and retention,” he wrote. “Ratings affect a teacher’s professional reputation as well because they are made public — they have even been printed in the newspaper. Needless to say, this Court would be hard-pressed to find anyone who would find this evaluation system fair to non-FCAT teachers, let alone be willing to submit to a similar evaluation system.”

“This case, however, is not about the fairness of the evaluation system,” Walker wrote. “The standard of review is not whether the evaluation policies are good or bad, wise or unwise; but whether the evaluation policies are rational within the meaning of the law. The legal standard for invalidating legislative acts on substantive due process and equal protection grounds looks only to whether there is a conceivable rational basis to support them,” even though this basis might be “unsupported by evidence or empirical data.”

Saturday’s Book Presentation

This past Saturday, those involved with Arizona State University’s edXchange initiative invited me to speak on VAMs and my new book, Rethinking Value-Added Models in Education: Critical Perspectives on Tests and Assessment-Based Accountability.

I would venture to say that most professional fields wouldn’t attract many Saturday lecture attendees. But, as most of you know, educators are certainly not the norm! We had a house full of educators from every corner of the field—classroom teachers, school administrators, school board members, college professors, parents, and everyone else you can imagine. Needless to say, it was an honor to share my work and engage in dialogue with so many concerned citizens. I left hopeful that more were informed, and in many ways armed, to help others make more informed decisions, at least in Arizona’s schools.

Thank you to those who were able to attend, and thank you all for continuing to do your part in improving the lives of our teachers and students, hopefully throughout the country.

See a few photos from the event below.

[Photos from the edXchange book talk]