In a thought-provoking paper, Josh Angrist and Steve Pischke describe the credibility revolution that is currently under way in economics. Having grown up in the Haavelmo-Cowles-Heckman tradition of structural econometrics, I have to admit that I resisted the intuitive appeal that this paper had for me. But the more I think about it, the more I see what is correct in the view that Josh and Steve defend in their paper, the more I see myself adapting this view to my own everyday research, and the happier I am about it. The credibility revolution makes a lot of sense to me since I can relate it to the way I was taught biology and physics, and to the reason why I loved these sciences: their convincing empirical grounding. I admittedly have my own interpretation of the credibility revolution, which does not fully overlap with that of Josh and Steve. I will try to make it clear in what follows.
To me, the credibility revolution means that data and empirical validation are as important as sound and coherent theories. It means that I cannot accept a theoretical proposition unless I have access to repeated tests showing that it is not rejected by the data. It also means that I do not use tools that have not repeatedly proven that they work.
Let me give three examples in economics. In economics as a behavioral science, a very important tool for modeling the behavior of agents under uncertainty is the expected utility framework, which dates back at least to Bernoulli, who introduced it to solve the Saint Petersburg paradox. Von Neumann and Morgenstern showed that this framework can be rationalized by a few simple axioms of behavior. Allais, in a very famous experiment, tested an implication of one of these axioms. What he found was that people's choices consistently violated this axiom. This result has been reproduced many times since then. This means that the expected utility framework, as a scientific description of how people behave, has been refuted. This led to the development of other axioms and other models of behavior under uncertainty, the most famous being Kahneman and Tversky's prospect theory. This does not mean that the expected utility framework is useless for engineering purposes. We seem to have good empirical evidence that it is approximately correct in a lot of situations (readers, feel free to leave references on this type of evidence in the comments). It might be simpler to use it than the more complex competing models of behavior that have been proposed since. The only criterion on which we should judge its performance as an engineering tool is its ability to predict actual choices. We are seeing more and more of this type of crucial test of our theories, and this is for the best. I think we should emphasize these empirical results in our teaching of economics: they are as important as the underlying theory that they test.
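To make the logic of Allais's test concrete, here is a minimal sketch. The four lotteries are the usual textbook version of the experiment (payoffs in millions), which is an assumption on my part since the numbers are not given above; the helper names are hypothetical. Most people pick A over B and D over C. The code draws many arbitrary increasing utility functions and checks that none of them can produce that pattern under expected utility, because the utility gap between A and B is always equal to the gap between C and D.

```python
import math
import random

# Lotteries as lists of (probability, payoff in millions).
# Assumption: these are the usual textbook Allais lotteries, not taken from the post.
A = [(1.00, 1)]
B = [(0.89, 1), (0.10, 5), (0.01, 0)]
C = [(0.11, 1), (0.89, 0)]
D = [(0.10, 5), (0.90, 0)]


def expected_utility(lottery, u):
    """Probability-weighted sum of the utilities of the payoffs."""
    return sum(p * u(x) for p, x in lottery)


def eu_gaps(u):
    """Return (EU(A) - EU(B), EU(C) - EU(D)) for the utility function u."""
    return (
        expected_utility(A, u) - expected_utility(B, u),
        expected_utility(C, u) - expected_utility(D, u),
    )


def random_increasing_utility(rng):
    """Draw an arbitrary increasing utility over the payoffs 0, 1 and 5."""
    levels = sorted(rng.random() for _ in range(3))
    table = dict(zip((0, 1, 5), levels))
    return lambda x: table[x]


rng = random.Random(0)
gaps = [eu_gaps(random_increasing_utility(rng)) for _ in range(1000)]

# The Allais pattern needs A preferred to B (first gap > 0)
# together with D preferred to C (second gap < 0).
allais_pattern_possible = any(
    gap_ab > 1e-12 and gap_cd < -1e-12 for gap_ab, gap_cd in gaps
)
print("Some utility function rationalizes the Allais pattern:", allais_pattern_possible)
```

The two gaps coincide algebraically (both equal 0.11 u(1) - 0.10 u(5) - 0.01 u(0)), so an expected utility maximizer who prefers A to B must also prefer C to D, whatever her utility function.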
The second example is in economics as engineering: McFadden's random utility model. McFadden used the utility maximization framework to model people's choice of transportation mode. He modeled the choice between using your car, the bus, your bike or walking as depending on the characteristics of the trip (such as the time it takes to get to work) and on your intrinsic preferences for one mode or the other. He estimated the preferences on a dataset of individuals in the San Francisco Bay Area in 1972. He then used his model to predict what would happen when an additional mode of transportation was introduced (the subway, or BART). Based on his estimates, he predicted that the market share of the subway would be 6.3%, well below the engineering estimates of the time, which hovered around 15%. When the subway opened in 1976, its market share soon reached 6.2% and stabilized there. This is one of the most beautiful and convincing examples of a test of an engineering tool in economics. Actually, this amazing performance convinced transportation researchers to abandon their old engineering models and use McFadden's. I think it is for this success that Dan was eventually awarded the Nobel prize in economics. We see more and more of this type of test of structural models, and this is for the best.
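As a rough illustration of how such a prediction works, here is a minimal sketch of a multinomial logit, one common way of making a random utility model operational. It is not McFadden's actual specification: the mode constants, the travel times, the time coefficient and the function names are all hypothetical, chosen only to show the mechanics of adding a new alternative and reading off its predicted share.

```python
import math


def logit_shares(utilities):
    """Multinomial logit choice probabilities: exp(V_j) / sum_k exp(V_k)."""
    highest = max(utilities.values())  # subtracted for numerical stability
    exp_v = {mode: math.exp(v - highest) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


def systematic_utility(mode_constant, travel_minutes, beta_time):
    """Intrinsic taste for the mode plus the disutility of travel time."""
    return mode_constant + beta_time * travel_minutes


def predict_shares(modes, beta_time):
    """Predicted market shares for a dict {mode: (constant, minutes)}."""
    return logit_shares(
        {m: systematic_utility(c, t, beta_time) for m, (c, t) in modes.items()}
    )


# Hypothetical "estimates": (mode constant, travel time in minutes).
BETA_TIME = -0.05
modes = {
    "car": (0.0, 30),
    "bus": (-1.0, 45),
    "bike": (-2.0, 40),
    "walk": (-2.5, 90),
}

shares_before = predict_shares(modes, BETA_TIME)

# Counterfactual: add a new mode with hypothetical characteristics.
modes_with_subway = dict(modes, subway=(-1.5, 35))
shares_after = predict_shares(modes_with_subway, BETA_TIME)

for mode, share in shares_after.items():
    print(f"{mode:>6}: {share:.1%}")
```

The point of the exercise is that the share of the new mode is a prediction made before the mode exists, which is what makes the comparison with the realized market share a genuine test.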
The third example is in economic, or rather behavioral, engineering (when I use the term "behavioral," I encompass all the sciences that try to understand man's behavior). From psychology, and increasingly from economics, we know that cognitive and non-cognitive (or socio-emotional) skills are malleable all along an individual's lifetime. We believe that it is possible to design interventions that help kids acquire these skills. But one still has to prove that these interventions actually work. That's why psychologists, and more recently economists, use randomized experiments to test them. In practice, they randomly select among a group of children the ones that are going to receive the intervention (the treatment group) and the ones that are going to stay in the business-as-usual scenario (the control group). By comparing the outcomes of the treatment and control groups, we can infer the effect of the intervention free of selection bias, since randomization makes both groups initially identical on average. This is exactly what doctors do to evaluate the effects of drugs. Jim Heckman and Tim Kautz summarize the evidence that we have so far on these experiments. The most famous one is the Perry preschool program, which followed the kids until their forties. The most fascinating finding of this experiment is that by providing a nurturing environment during the early years of the kids' lives (from 3 to 6), the Perry project was able to durably change the kids' lives. The surprising result is that this change was not triggered by a change in cognitive skills, but only by a change in non-cognitive skills. This impressive evidence has directed a lot of attention to childhood programs and to the role of non-cognitive skills. Jim Heckman is one of the most ardent proponents of this approach in economics.
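The logic of randomization can be illustrated with a small simulation. This is a sketch under made-up assumptions, not a reproduction of any actual study: the outcome equation, the true effect of 2.0, the "family background" confounder and all names are hypothetical. It compares the difference in means under random assignment with the same comparison when families with a more favorable background are more likely to enroll.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0    # hypothetical effect of the intervention on the outcome
N_CHILDREN = 20_000  # hypothetical sample size


def outcome(background, treated, rng):
    """Hypothetical outcome: depends on family background, treatment and noise."""
    return 10.0 + 3.0 * background + TRUE_EFFECT * treated + rng.gauss(0.0, 1.0)


def difference_in_means(records):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [y for d, y in records if d == 1]
    control = [y for d, y in records if d == 0]
    return mean(treated) - mean(control)


rng = random.Random(42)
backgrounds = [rng.gauss(0.0, 1.0) for _ in range(N_CHILDREN)]

# Randomized experiment: a coin flip decides who gets the intervention.
randomized = []
for b in backgrounds:
    d = 1 if rng.random() < 0.5 else 0
    randomized.append((d, outcome(b, d, rng)))

# Observational comparison: kids with a better background enroll more often.
self_selected = []
for b in backgrounds:
    d = 1 if b + rng.gauss(0.0, 1.0) > 0 else 0
    self_selected.append((d, outcome(b, d, rng)))

rct_estimate = difference_in_means(randomized)
naive_estimate = difference_in_means(self_selected)

print(f"true effect:           {TRUE_EFFECT:.2f}")
print(f"randomized estimate:   {rct_estimate:.2f}")
print(f"self-selected 'effect': {naive_estimate:.2f}")
```

Under random assignment, background is balanced across the two groups, so the difference in means is close to the true effect. Under self-selection, the same comparison also picks up the difference in background between the groups.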
The credibility revolution makes sense to me also because of the limitations of Haavelmo's framework. As I already said, trying to infer stable autonomous laws from observational data is impossible, since there is not enough free variation in these data. There are too many unknowns and not enough observations to recover each of them. Haavelmo was well aware of this problem, but the solution that he and the Cowles Commission advocated (using a priori restrictions to restore identification) was doomed to fail. What we need in order to learn something about how our theories and our engineering models perform is not a priori restrictions on how the world behaves, but more free and independent information about how the world works. This is basically what Josh's argument is about: choose these restrictions so as to make them as convincing as experiments. That's why Josh coined the term natural experiments: the variation in the observed data that we use should be as good as an experiment, stemming not from theory but from luck. The world has offered us some free variation, and we can use it to recover something about its deeper relationships.
The problem with the natural experiment approach is that whether we have identified free variation, and whether it really can be used to discriminate among theories, is highly debatable. Sometimes we cannot do better, and we have to try to prove that the natural variation is as good as an experiment. But a lot of the time, we can think of a way of generating free variation ourselves by building a field experiment. And this is exactly what is happening today in economics. All these experiments (or RCTs: Randomized Controlled Trials) that we see in the field are just ways of generating free variation, with several purposes in mind: testing policies, testing the predictive accuracy of models, testing scientific theories. Some experiments can do several of these things at the same time.
This is an exciting time to do economics. I will post in the future on other early engineering and scientific tests, and I will report on my own and others' research that I find exciting.