How do you learn about cause and effect without randomized experiments? We provide a dramatically improved method of inferring causality even when randomization is impossible. We also invented a computational approach that makes our technique approximately one million times faster than previous approaches, meaning it will typically run in only a few minutes on commonly sized data sets, opening up new ways of conducting empirical analyses. The result is an easy-to-understand statistical technique, along with free and easy-to-use open source software, that researchers can use to learn about cause and effect without experiments. Correlation is still not causation, but we show how observational correlations, when used appropriately, can teach us a great deal about causation.
Although randomization is obviously impossible for numerous crucial questions in political science, the social sciences, the sciences, medicine, education, government, public policy, and business, we often have no choice but to draw some type of causal conclusion in these areas. For example, no one has ever conducted a randomized experiment on tobacco consumption in humans, but we know that smoking kills half a million people a year in the US alone. Similarly, no one has ever randomized incumbency status among members of Congress, and yet we have learned that voters are more likely to cast their ballots for incumbents than challengers. Much hard work has gone into solving these problems; our article offers a way to make such generalizations easier, faster, and more reliable.
Our approach builds on a highly intuitive technique called matching. The idea is sophisticated, but the basics are simple. Suppose we want to know whether renovating school buildings improves educational outcomes. As generous as NSF grants are, no researcher will be able to randomly assign new buildings to school districts. So instead, we can compare school districts that choose to build new buildings with those that do not, but we will have to avoid the bias that would occur if those districts differ in other ways. For example, districts with new buildings might also be areas with greater wealth and more parental involvement in children's education. So a simple version of matching in this case would be to find a sample of school districts that recently built new buildings. Then, for each one, find a very similar ("matched") district that chose not to renovate its buildings. We can define "very similar" according to wealth, parental involvement, the condition of the school buildings, teaching styles, and many other factors. If we can find these matched pairs, then we might be able to approximate a randomized experiment from purely observational data.
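The pairing step described above can be sketched in a few lines of code. This is a minimal, hypothetical illustration of one-to-one nearest-neighbor matching on standardized covariates (e.g., wealth and parental involvement), not the method introduced in the article:

```python
import numpy as np

def match_pairs(treated, control):
    """Greedy one-to-one nearest-neighbor matching on covariates.

    treated, control: arrays of shape (n, k) holding covariates such as
    wealth and parental involvement, assumed already standardized.
    Returns a list of (treated_index, control_index) matched pairs.
    """
    available = list(range(len(control)))  # control units not yet matched
    pairs = []
    for i, t in enumerate(treated):
        # Euclidean distance from this treated unit to each remaining control
        dists = [np.linalg.norm(t - control[j]) for j in available]
        best = available[int(np.argmin(dists))]
        pairs.append((i, best))
        available.remove(best)  # each control unit is used at most once
    return pairs
```

In practice, analysts use richer distance metrics (e.g., Mahalanobis distance or propensity scores) and discard treated units with no acceptable match; the sketch only conveys the core idea of pairing each treated unit with its most similar control.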
In practice, no two school districts are exactly the same, and so a large number of matching methods have been developed, each of which works best in different circumstances at making the treated group (the one with the new schools) as similar as possible to the control group (those without the new schools). In contrast, our American Journal of Political Science article, now available here for early view, introduces new methods for matching that produce the optimal matched sample, without having to choose among matching methods.
In particular, every matching method begins with a data set and prunes non-matches in order to improve the similarity between the remaining data in the treated and control groups. For any number of pruned observations, there exists a subsample of the original data that has the highest level of similarity, but the number of possible subsets that would need to be checked to find the optimal one is gargantuan (usually larger than the number of elementary particles in the universe), and so previous matching methods try to approximate this optimum. Our algorithm quickly identifies the best possible subset directly, without running any existing matching algorithm. We have proven mathematically that no matching method in existence, or which could be invented, can produce higher levels of similarity between the treated and control groups than our algorithm. The algorithm is so fast that we are able to recommend computing it for all possible numbers of observations pruned and presenting the full results for the analyst to choose from, without tradeoffs or compromises. Please see our software at http://j.mp/MatchingFrontier.
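The output described above, similarity at every possible number of pruned observations, can be pictured with a toy sketch. To be clear, this greedy pruning is not the article's optimal algorithm; it is a hypothetical stand-in that merely shows the shape of a balance-sample size frontier, using distance between covariate means as a crude imbalance measure:

```python
import numpy as np

def toy_frontier(treated, control):
    """Trace imbalance as control units are pruned one at a time.

    A toy illustration only: the article's algorithm finds the *optimal*
    subset at each sample size, whereas this sketch simply prunes the
    control unit farthest from the treated covariate mean at each step
    and records the remaining mean imbalance.
    Returns a list of (number_pruned, imbalance) points.
    """
    t_mean = treated.mean(axis=0)
    # Order control units by distance to the treated mean, farthest first
    order = np.argsort(-np.linalg.norm(control - t_mean, axis=1))
    frontier = []
    for n_pruned in range(len(control)):
        kept = control[order[n_pruned:]]
        # Crude imbalance: distance between group covariate means
        imbalance = float(np.linalg.norm(kept.mean(axis=0) - t_mean))
        frontier.append((n_pruned, imbalance))
    return frontier
```

Plotting these points gives a curve of imbalance against sample size, which is the kind of full-frontier result the analyst can inspect before choosing how many observations to prune.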
Written by Gary King of Harvard University, Christopher Lucas of Harvard University, and Richard Nielsen of Massachusetts Institute of Technology. Their article “The Balance-Sample Size Frontier in Matching Methods for Causal Inference” will be published in a forthcoming issue of the American Journal of Political Science and is currently available for Early View.