Contradictory Data Complicate Definitions of, and Attempts to Capture, “Teacher Effects”


A researcher from Brown University, Matthew Kraft, and his student recently released a “working paper” in which I think you all will (and should) be interested. The study is about “…Teacher Effects on Complex Cognitive Skills and Social-Emotional Competencies,” those effects that are beyond “just” test scores (see also a related article on this working piece released in The Seventy Four). This one is 64 pages long, but here are the (condensed) highlights as I see them.

The researchers use data from the Bill & Melinda Gates Foundation’s Measures of Effective Teaching (MET) Project “to estimate teacher effects on students’ performance on cognitively demanding open-ended tasks in math and reading, as well as their growth mindset, grit, and effort in class.” They find “substantial variation in teacher effects on complex task performance and social-emotional measures. [They] also find weak relationships between teacher effects on state standardized tests, complex tasks, and social-emotional competencies” (p. 1).

More specifically, the researchers found that: (1) “teachers who are most effective at raising student performance on standardized tests are not consistently the same teachers who develop students’ complex cognitive abilities and social-emotional competencies” (p. 7); (2) “While teachers who add the most value to students’ performance on state tests in math do also appear to strengthen their analytic and problem-solving skills, teacher effects on state [English/language arts] tests are only moderately correlated with open-ended tests in reading” (p. 7); and (3) “[T]eacher effects on social-emotional measures are only weakly correlated with effects on state achievement tests and more cognitively demanding open-ended tasks” (p. 7).

The ultimate finding, then, is that “teacher effectiveness differs across specific abilities” and definitions of what it means to be an effective teacher (p. 7). Likewise, the authors concluded that essentially all current teacher evaluation systems are not mapping onto similar definitions of teacher quality or effectiveness, as “we” have theorized and continue to theorize. This holds even though the measures included within the MET studies are/were some of the best available, given the multiple sources of data MET researchers included (e.g., traditional tests, indicators capturing complex cognitive skills and social-emotional competencies, observations, student surveys).

Hence, attaching high-stakes consequences to such data is (still) not yet warranted, especially when multiple data yield contradictory findings depending on how one defines effective teaching or its most important components (e.g., test scores v. affective, socio-emotional effects). An effective teacher here might not be an effective teacher there, even if defined similarly in like schools, districts, or states. As per Kraft, “while high-stakes decisions may be an important part of teacher evaluation systems, we need to decide on what we value most when making these decisions rather than just using what we measure by default because it is easier.” Ignoring such complexities will not make standard, uniform, or defensible the high-stakes decisions that some states still want to attach to data derived via multiple measures and sources, given that data defined differently, even within standard definitions of “effective teaching,” continue to contradict one another.

Accordingly, “[q]uestions remain about whether those teachers and schools that are judged as effective by state standardized tests [and the other measures] are also developing the skills necessary to succeed in the 21st century economy” (p. 36). Likewise, it is not necessarily the case that teachers defined as high value-added teachers, using these common indicators, are indeed high value-added teachers, given that the same teachers may be defined as low value-added teachers when different aspects of “effective teaching” are also examined.

Ultimately, this further complicates “our” current definitions of effective teaching, especially when those definitions are constructed in policy arenas oft-removed from the realities of America’s public schools.

Reference: Kraft, M. A., & Grace, S. (2016). Teaching for tomorrow’s economy? Teacher effects on complex cognitive skills and social-emotional competencies. Providence, RI: Brown University. Working paper.

*Note: This study was ultimately published with the following citation: Blazar, D., & Kraft, M. A. (2017). Teacher and teaching effects on students’ attitudes and behaviors. Educational Evaluation and Policy Analysis, 39(1), 146–170. doi:10.3102/0162373716670260.

1 thought on “Contradictory Data Complicate Definitions of, and Attempts to Capture, ‘Teacher Effects’”

  1. The MET data and indicators are so flawed and ideologically skewed, in addition to being not relevant to the job assignments of many teachers, that the Brown study strikes me as another case of begging serious questions. On the matter of SEL, for example, there is not yet much agreement about the constructs or about the degree to which teachers can and should intervene formally with planned instruction and standards. That view is favored by CASEL and embedded in Indiana standards from preschool to grade 12, and it is also being embedded in school evaluations in the works for 10 California CORE Districts, where field testing of items for SEL has been outsourced to startup Panorama Education.
    Meanwhile, Duckworth and Yeager (grit) and Dweck (mindset) send mixed messages about testing SEL.
    Recall that the observations in the MET studies were based on Danielson’s protocol, which is not valid for every grade and subject, and that they were rated from teacher-selected, recorded videos made just for the study, not live classrooms.
    Upshot: the long life of deeply flawed data from the MET study has infected too many studies. Moreover, Gates has kept those MET studies in circulation by funding doctoral students who will continue to analyze the data and publish dubious conclusions.
    SEL is being marketed as “deeper learning,” and by the rhetoric of “non-academic soft skills.” From my perspective, SEL programs and resources are really addressing character education and they represent an effort to standardize and normalize specific values as good and great for every student–not far removed from indoctrination…
