On Rigor 2

This is a followup to my previous post on external validity and rigor, and a further attempt to pretend that this blog is not just about productivity, apps, and hacks.

Jed Friedman has a great piece on the Development Impact blog about a working paper by Hunt Allcott (whom I cited in my previous post). Allcott describes a concept called “External Unconfoundedness,” which perfectly articulates what I was trying to get at in my previous post: an attempt to bring statistical notions of unbiasedness to questions of external validity. Many of the conditions for external unconfoundedness concern the environment of the original study, and Allcott is particularly interested in site selection bias – the extent to which the setting for a study is chosen because of favorable conditions.

Both the working paper and the post are great reads.

Link: Toward a more systematic approach to external validity: Understanding site-selection bias

On Rigor

Lant Pritchett wrote a piece for the Building State Capacity blog about the notion of “rigorous evidence.” At the risk of putting words in his mouth, my sense is that his argument boils down to this: promoters of evidence-based policy overplay their hands by focusing exclusively on internal validity[1]. He says as much in his post:

Evidence would be “rigorous” about predicting the future impact of the adoption of a policy only if the conditions under which the policy was to be implemented were exactly the same in every relevant dimension as that under which the “rigorous” evidence was generated. But that can never be so because neither economics—nor any other social science—have theoretically sound and empirically validated invariance laws that specify what “exactly the same” conditions would be.

Pritchett raises an important point; our understanding of internal validity and our methods for assessing it are far more developed than their counterparts for external validity[2]. However, I can’t help but feel that Pritchett is overplaying his hand as well. We consider a study to be internally valid if our comparison groups are equivalent in expectation, not if they are exactly the same in every relevant dimension. This may seem like mincing words, but there’s a real distinction between exact equivalence and a plausible argument for the absence of bias. The latter is the standard to which we hold studies when assessing internal validity, and we should apply the same standard to external validity. Still, the point remains that our understanding of external validity falls far short of even this weaker definition.
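The distinction between “exactly the same” and “equivalent in expectation” is easy to see in a quick simulation. Here is a minimal sketch (the covariate, sample sizes, and numbers are all invented for illustration): any single randomization leaves a nonzero gap between groups on a pre-treatment covariate, but the gap averages out to roughly zero across many hypothetical randomizations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical illustration: randomization makes comparison groups
# equivalent *in expectation*, not identical in any single draw.
n, reps = 200, 2000
gaps = []
for _ in range(reps):
    age = rng.normal(35, 10, n)            # a pre-treatment covariate
    treat = rng.permutation(n) < n // 2    # random assignment to groups
    gaps.append(age[treat].mean() - age[~treat].mean())

gaps = np.array(gaps)
# Individual experiments show nonzero gaps; the mean gap over many
# hypothetical randomizations is close to zero.
print(round(gaps.mean(), 3), round(np.abs(gaps).max(), 2))
```

In other words, any one experiment’s groups differ, sometimes substantially, which is exactly why the internal-validity standard is an argument about bias, not about exact sameness.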

Link: Rigorous Evidence Isn’t


  1. Wikipedia’s entry on external validity vs. internal validity provides a nice overview of the tension for those unfamiliar with the concepts.  ↩

  2. A recent working paper by Hunt Allcott and Sendhil Mullainathan is an interesting foray into developing metrics for external validity. Unfortunately, these metrics seem to require a whole heap of data that is rarely available for a single intervention.  ↩

Teaching Regression Discontinuity

A very enjoyable post on the regression discontinuity study design over at the must-read Development Impact blog. I was lucky enough to introduce this design to a room full of policy master’s students this semester, and I agree completely with all of Evans’ points. In addition to being a conceptually interesting design, RD is a great way to introduce students to the rationale behind quasi-experimental studies.
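For anyone teaching the design, the core intuition fits in a few lines of simulation. This is a toy sketch, not anything from the linked post: the scholarship setting, the cutoff, and the true effect size are all made up. Units just above and just below a cutoff are plausibly comparable, so a jump in outcomes at the cutoff estimates the treatment effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: a scholarship awarded to students scoring >= 60.
n = 5000
score = rng.uniform(0, 100, n)            # running variable
treated = (score >= 60).astype(float)     # assignment determined by cutoff
# Outcome is smooth in the score, plus a true treatment jump of 5.0.
outcome = 0.3 * score + 5.0 * treated + rng.normal(0, 2, n)

# Local linear fit within a bandwidth on each side of the cutoff.
cutoff, h = 60.0, 10.0
left = (score >= cutoff - h) & (score < cutoff)
right = (score >= cutoff) & (score <= cutoff + h)

def fit_at_cutoff(x, y):
    # Fit a line and return its predicted value at the cutoff (intercept
    # after centering the running variable at the cutoff).
    slope, intercept = np.polyfit(x - cutoff, y, 1)
    return intercept

# The RD estimate is the gap between the two fitted lines at the cutoff.
effect = fit_at_cutoff(score[right], outcome[right]) - fit_at_cutoff(score[left], outcome[left])
print(round(effect, 2))
```

The estimate lands near the true jump of 5.0, and playing with the bandwidth `h` is a nice way to show students the bias-variance tradeoff that real RD applications wrestle with.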