The forthcoming article “Does Regression Produce Representative Estimates of Causal Effects?” by Peter M. Aronow and Cyrus Samii is summarized by the authors here:
Issues of generalizability are central in current discussions about causal inference. By now, it is broadly understood that instrumental variables identify an average treatment effect that is local to the “complier” subpopulation. Regression discontinuity identifies effects that are local to the “cutoff.” Randomized experiments may be undertaken in contexts that are unusually permissive. Matching estimators mechanically discard units off the “common support.” From this, one gets the sense that by privileging internal validity, such methods yield results that lack external validity. “Classical” regression modeling methods differ in that they typically work from datasets that are as representative as possible. It would seem that such classical methods have some higher claim to external validity, even if the “causal identification” is not so crisp.
In our paper, we demonstrate that the appearance of greater external validity in classical regression studies is an illusion. One may have data from a sample that is representative of a population of interest—call this the nominal sample. Setting aside issues of internal validity, if one uses this sample to estimate a causal effect via a multiple regression model, the units in the nominal sample will contribute to the effect estimate to differing extents. We can derive weights that measure the contribution from each member of the nominal sample. The weights are in expectation equal to the variances of the treatment variables conditional on the control variables. By reweighting the nominal sample with these multiple regression weights, we can characterize the effective sample that regression actually uses to estimate effects. The effective sample will typically differ from the nominal sample. The examples in our paper illustrate this point. Even though one starts with a sample representative of the population of interest, what one obtains is an effect for which there are still reasons to question generalizability to the population of interest. And this is, in some sense, a best-case scenario for classical regression. Without strong assumptions, there is no guarantee that classical regression will estimate any causal effect whatsoever.
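The reweighting idea can be illustrated with a small simulation. This is not the authors' code, and the variable names and data-generating process are hypothetical; it is a minimal sketch of the standard way such weights are estimated, namely as squared residuals from regressing the treatment on the controls (an estimate of the conditional variance of the treatment given the controls). The sketch then compares a covariate's mean in the nominal sample to its mean in the effective sample.

```python
# Sketch (assumed setup, not the paper's replication code): the regression
# weight for each unit is estimated as the squared residual from regressing
# the treatment D on the controls X, i.e. an estimate of Var(D | X).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Nominal sample: binary control x, treatment d whose variance depends on x,
# and an outcome y. Treatment is much rarer (lower variance) when x == 0.
x = rng.binomial(1, 0.5, size=n)
p = np.where(x == 1, 0.5, 0.05)
d = rng.binomial(1, p, size=n).astype(float)
y = 1.0 * d + 0.5 * x + rng.normal(size=n)

# Estimate each unit's regression weight: squared residual from D ~ 1 + X.
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, d, rcond=None)[0]
w = (d - X @ beta) ** 2  # estimated conditional variance of D given X

# Compare the nominal sample to the effective sample regression actually uses.
nominal_share_x1 = x.mean()              # close to 0.5 by construction
effective_share_x1 = np.average(x, weights=w)
print(nominal_share_x1, effective_share_x1)
```

Because the treatment varies much more among units with x = 1, those units receive far more weight: the effective sample is dominated by the x = 1 group even though the nominal sample splits evenly. This is the sense in which a representative nominal sample does not guarantee a representative effective sample.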
So, classical regression produces “local” effect estimates, just like natural experimental methods and experiments on non-representative populations, but with the added limitation of questionable internal validity. Moreover, the conditions required to go from “local” to “generalized” effects cannot always be met in the real world. Does this doom us to a world of idiosyncratic knowledge? Not necessarily. We should be analyzing how effects vary with the features of effective samples, and using that variation to inform theory. It may even be useful to have studies that look at highly unusual samples, to the extent that they provide special theoretical traction. Any such analysis requires that we accurately characterize the subpopulation that gives rise to the effects that we estimate.
Aronow is assistant professor, Department of Political Science, Yale University, and Samii is assistant professor, Department of Politics, New York University.