
Instruments, randomization, and learning about development

Angus Deaton
Research Program in Development Studies
Center for Health and Wellbeing
Princeton University

March, 2010

This is a lightly revised and updated version of “Instruments of development: randomization in the tropics and the search for the elusive keys to economic development,” originally given as the Keynes Lecture, British Academy, October 9th, 2008, and published in the Proceedings. I am grateful to the British Academy for permission to reprint the unchanged material. I have clarified some points in response to Imbens (2009), but have not changed anything of substance, and stand by my original arguments.

For helpful discussions and comments on earlier versions, I am grateful to Abhijit Banerjee, Tim Besley, Richard Blundell, Ron Brookmayer, David Card, Anne Case, Hank Farber, Bill Easterly, Erwin Diewert, Jean Drèze, Esther Duflo, Ernst Fehr, Don Green, Jim Heckman, Ori Heffetz, Karla Hoff, Bo Honoré, Štĕpán Jurajda, Dean Karlan, Aart Kraay, Michael Kremer, David Lee, Steve Levitt, Winston Lin, John List, David McKenzie, Margaret McMillan, Costas Meghir, Branko Milanovic, Chris Paxson, Franco Peracchi, Dani Rodrik, Sam Schulhofer-Wohl, Jonathan Skinner, Jesse Rothstein, Rob Sampson, Burt Singer, Finn Tarp, Alessandro Tarozzi, Chris Udry, Gerard van den Berg, Eric Verhoogen, Susan Watkins, Frank Wolak, and John Worrall. I acknowledge a particular intellectual debt to Nancy Cartwright, who has discussed these issues patiently with me for several years, whose own work on causality has greatly influenced me, and who pointed me towards other important work; to Jim Heckman, who has long thought deeply about these issues, and many of whose views are recounted here; and to the late David Freedman, who consistently and effectively fought against the (mis)use of technique as a substitute for substance and thought.
None of which removes the need for the usual disclaimer, that the views expressed here are entirely my own. I acknowledge financial support from NIA through grant P01 AG05842–14 to the National Bureau of Economic Research.

Instruments, randomization, and learning about development
Angus Deaton

ABSTRACT

There is currently much debate about the effectiveness of foreign aid and about what kind of projects can engender economic development. There is skepticism about the ability of econometric analysis to resolve these issues, or of development agencies to learn from their own experience. In response, there is increasing use in development economics of randomized controlled trials (RCTs) to accumulate credible knowledge of what works, without over-reliance on questionable theory or statistical methods. When RCTs are not possible, the proponents of these methods advocate quasi-randomization through instrumental variable (IV) techniques or natural experiments. I argue that many of these applications are unlikely to recover quantities that are useful for policy or understanding: two key issues are the misunderstanding of exogeneity, and the handling of heterogeneity. I illustrate from the literature on aid and growth. Actual randomization faces similar problems as does quasi-randomization, notwithstanding rhetoric to the contrary. I argue that experiments have no special ability to produce more credible knowledge than other methods, and that actual experiments are frequently subject to practical problems that undermine any claims to statistical or epistemic superiority. I illustrate using prominent experiments in development and elsewhere. As with IV methods, RCT-based evaluation of projects, without guidance from an understanding of underlying mechanisms, is unlikely to lead to scientific progress in the understanding of economic development.
I welcome recent trends in development experimentation away from the evaluation of projects and towards the evaluation of theoretical mechanisms.

KEYWORDS: Randomized controlled trials, mechanisms, instrumental variables, development, foreign aid, growth, poverty reduction

1. Introduction

The effectiveness of development assistance is a topic of great public interest. Much of the public debate among non-economists takes it for granted that, if the funds were made available, poverty would be eliminated, Pogge (2005), Singer (2004), and at least some economists agree, Sachs (2005, 2008). Others, most notably Easterly (2006), (2009), are deeply skeptical, a position that has been forcefully argued at least since Bauer (1971, 1981). Few academic economists or political scientists agree with Sachs’ views, but there is a wide range of intermediate positions, well assembled by Easterly (2008). The debate runs the gamut from the macro—can foreign assistance raise growth rates and eliminate poverty?—to the micro—what sorts of projects are likely to be effective? Should aid focus on electricity and roads, or on the provision of schools and clinics or vaccination campaigns? Here I shall be concerned with both the macro and micro kinds of assistance.

I shall have very little to say about what actually works and what does not; but it is clear from the literature that we do not know. Instead, my main concern is with how we should go about finding out whether and how assistance works and with methods for gathering evidence and learning from it in a scientific way that has some hope of leading to the progressive accumulation of useful knowledge about development. I am not an econometrician, but I believe that econometric methodology needs to be assessed, not only by methodologists, but by those who are concerned with the substance of the issue.
Only they (we) are in a position to tell when something has gone wrong with the application of econometric methods, not because they are incorrect given their assumptions, but because their assumptions do not apply, or because they are incorrectly conceived for the problem at hand. Or at least that is my excuse for meddling in these matters.

Any analysis of the extent to which foreign aid has increased economic growth in recipient countries immediately confronts the familiar problem of simultaneous causality; the effect of aid on growth, if any, will be disguised by effects running in the opposite direction, from poor economic performance to compensatory or humanitarian aid. It is not obvious how to disentangle these effects, and some have argued that the question is unanswerable and that econometric studies of it should be abandoned. Certainly, the econometric studies that use international evidence to examine aid effectiveness currently have low professional status. Yet it cannot be right to give up on the issue. There is no general or public understanding that nothing can be said, and to give up the econometric analysis is simply to abandon precise statements for loose and unconstrained histories of episodes selected to support the position of the speaker.

The analysis of aid effectiveness typically uses cross-country growth regressions with the simultaneity between aid and growth dealt with using instrumental variable methods. I shall argue in the next section that there has been a good deal of misunderstanding in the literature about the use of instrumental variables. Econometric analysis has changed its focus over the years, away from the analysis of models derived from theory towards much looser specifications that are statistical representations of program evaluation. With this shift, instrumental variables have moved from being solutions to a well-defined problem of inference to being devices that induce quasi-randomization.
Old and new understandings of instruments co-exist, leading to errors, misunderstandings and confusion, as well as unfortunate and unnecessary rhetorical barriers between disciplines working on the same problems. These abuses of technique have contributed to a general skepticism about the ability of econometric analysis to answer these big questions.

A similar state of affairs exists in the microeconomic area, in the analysis of the effectiveness of individual programs and projects, such as the construction of infrastructure—dams, roads, water supply, electricity—and in the delivery of services—education, health or policing. There is frustration with aid organizations, particularly the World Bank, for allegedly failing to learn from their projects and to build up a systematic catalog of what works and what does not. In addition, some of the skepticism about macro econometrics extends to micro econometrics, so that there has been a movement away from such methods and towards randomized controlled trials. According to Esther Duflo, one of the leaders of the new movement in development, “Creating a culture in which rigorous randomized evaluations are promoted, encouraged, and financed has the potential to revolutionize social policy during the 21st century, just as randomized trials revolutionized medicine during the 20th,” this from a 2004 Lancet editorial headed “The World Bank is finally embracing science.”

In Section 4 of this paper, I shall argue that under ideal circumstances, randomized evaluations of projects are useful for obtaining a convincing estimate of the average effect of a program or project. The price for this success is a focus that is too narrow and too local to tell us “what works” in development, to design policy, or to advance scientific knowledge about development processes.
Project evaluations, whether using randomized controlled trials or non-experimental methods, are unlikely to disclose the secrets of development nor, unless they are guided by theory that is itself open to revision, are they likely to be the basis for a cumulative research program that might lead to a better understanding of development. This argument applies a fortiori to instrumental variables strategies that are aimed at generating quasi-experiments; the value of econometric methods cannot and should not be assessed by how closely they approximate randomized controlled trials. Following Cartwright (2007a, b), I argue that evidence from randomized controlled trials can have no special priority. Randomization is not a gold standard because “there is no gold standard,” Cartwright (2007a). Randomized controlled trials cannot automatically trump other evidence, they do not occupy any special place in some hierarchy of evidence, nor does it make sense to refer to them as “hard” while other methods are “soft”. These rhetorical devices are just that; metaphor is not argument, nor does endless repetition make it so.

More positively, I shall argue that the analysis of projects needs to be refocused towards the investigation of potentially generalizable mechanisms that explain why and in what contexts projects can be expected to work. The best of the experimental work in development economics already does so, because its practitioners are too talented to be bound by their own methodological prescriptions. Yet there would be much to be said for doing so more openly. I concur with Pawson and Tilley (1997), who argue that thirty years of project evaluation in sociology, education and criminology was largely unsuccessful because it focused on whether projects worked instead of on why they worked.
In economics, warnings along the same lines have been repeatedly given by James Heckman, see particularly Heckman (1992) and Heckman and Smith (1995), and much of what I have to say is a recapitulation of his arguments.

The paper is organized as follows. Section 2 lays out some econometric preliminaries concerning instrumental variables and the vexed question of exogeneity. Section 3 is about aid and growth. Section 4 is about randomized controlled trials. Section 5 is about using empirical evidence and where we should go now.

2. Instruments, identification, and the meaning of exogeneity

It is useful to begin with a simple and familiar econometric model that I can use to illustrate the differences between different flavors of econometric practice; this has nothing to do with economic development, but it is simple and easy to contrast with the development practice that I wish to discuss. In contrast to the models that I will discuss later, I think of this as a model in the spirit of the Cowles Foundation. It is the simplest possible Keynesian macroeconomic model of national income determination taken from once-standard econometrics textbooks. There are two equations which together comprise a complete macroeconomic system. The first equation is a consumption function, in which aggregate consumption is a linear function of aggregate national income, while the second is the national income accounting identity that says that income is the sum of consumption and investment. I write the system in standard notation as

C = α + βY + u   (1)

Y = C + I   (2)

According to (1), consumers choose the level of aggregate consumption with reference to their income, while in (2), investment is set by the “animal spirits” of entrepreneurs in some way that is outside of the model.
No modern macroeconomist would take this model seriously, though the simple consumption function is an ancestor of more satisfactory and complete modern formulations; in particular, we can think of it (or at least its descendants) as being derived from a coherent model of intertemporal choice. Similarly, modern versions would postulate some theory for what determines investment I; here it is simply taken as given, and assumed to be orthogonal to the consumption disturbance u.

In this model, consumption and income are simultaneously determined so that, in particular, a stochastic realization of u—consumers displaying animal spirits of their own—will affect not only C, but also Y through equation (2), so that there is a positive correlation between u and Y. As a result, ordinary least squares estimation of (1) will lead to upwardly biased and inconsistent estimates of the parameter β. This simultaneity problem can be dealt with in a number of ways. One is to solve (1) and (2) to get the reduced form equations

C = α/(1−β) + [β/(1−β)] I + u/(1−β)   (3)

Y = α/(1−β) + [1/(1−β)] I + u/(1−β)   (4)

Both of these equations can be consistently estimated by OLS, and it is easy to show that the same estimates of α and β will be obtained from either one.

An alternative method of estimation is to focus on the consumption function (1), and to use our knowledge of (2) to note that investment can be used as an instrumental variable (IV) for income. In the IV regression, there is a “first stage” regression in which income is regressed on investment; this is identical to equation (4), which is part of the reduced form. In the second stage, consumption is regressed on the predicted value of income from (4). In this simple case, the IV estimate of β is identical to the estimate from the reduced form. This simple model may not be a very good model, but it is a model, if only a primitive one.
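The upward bias of OLS and its correction by IV can be seen in a small simulation. This is an illustrative sketch, not from the paper: the parameter values (α = 10, β = 0.6) and the distributions of I and u are my own assumptions. The code generates data from the two-equation system, then compares OLS of C on Y with IV using I as the instrument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
alpha, beta = 10.0, 0.6  # assumed values for illustration

# Investment is "animal spirits": exogenous, independent of u.
I = rng.normal(50.0, 10.0, n)
u = rng.normal(0.0, 5.0, n)

# Solve the system for the reduced form, then recover C:
# Y = (alpha + I + u) / (1 - beta),  C = alpha + beta*Y + u
Y = (alpha + I + u) / (1.0 - beta)
C = alpha + beta * Y + u

# OLS of C on Y: biased upward because Y is positively correlated with u.
beta_ols = np.cov(C, Y)[0, 1] / np.var(Y, ddof=1)

# IV with I as instrument: cov(C, I) / cov(Y, I), consistent for beta.
beta_iv = np.cov(C, I)[0, 1] / np.cov(Y, I)[0, 1]

print(f"OLS estimate: {beta_ols:.3f}")
print(f"IV estimate:  {beta_iv:.3f}")
```

With these assumed variances the OLS slope exceeds the true β by roughly cov(Y, u)/var(Y), while the IV estimate stays close to β, which is the point of the text: investment identifies the consumption function because it shifts income but not the disturbance.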
I now leap forward sixty years, and consider an apparently similar set up, again using an absurdly simple specification. The World Bank (let us imagine) is interested in whether to advise the government of China to build more railway stations as part of its poverty reduction strategy. The Bank economists write down an econometric model in which the poverty head count ratio in city c is taken to be a linear function of an indicator R of whether or not the city has a railway station,

P_c = α + θR_c + v_c   (5)

where θ (I hesitate to call it a parameter) indicates the effect—presumably negative—of infrastructure (here a railway station) on poverty. While we cannot expect to get useful estimates of θ from OLS estimation of (5)—railway stations may be built to serve more prosperous cities, they are rarely built in deserts where there are no people, or there may be “third factors” that influence both—this is seen as a “technical problem” for which there is a wide range of econometric treatments including, of course, instrumental variables. We no longer have the reduced form of the previous model to guide us, but if we can find an instrument Z that is correlated with whether a town has a railway station, but uncorrelated with v, we can do the same calculations and obtain a consistent estimate. For the record, I write this equation

R_c = γ + δZ_c + η_c   (6)

Good candidates for Z might be indicators of whether the city has been designated by the Government of China as belonging to a special “infrastructure development area,” or perhaps an earthquake that conveniently destroyed a selection of railway stations, or even the existence of a river confluence near the city, since rivers were an early source of power, and railways served the power-based industries. I am making fun, but not much.
And these instruments all have the real merit that there is some mechanism linking them to whether or not the town has a railway station, something that is not automatically guaranteed by the instrument being correlated with R and uncorrelated with v; see for example Reiss and Wolak (2007, pp. 4296–8).

My main argument is that the two econometric structures, in spite of their resemblance and the fact that IV techniques can be used for both, are in fact quite different. In particular, the IV procedures that work for the effect of national income on consumption are unlikely to give useful results for the effect of railway stations on poverty.

To explain the differences, I begin with the language. In the original example, the reduced form is a fully specified system, since it is derived from a notionally complete model of the determination of income. Consumption and income are treated symmetrically, and appear as such in the reduced form equations (3) and (4). In contemporary examples, such as the railways, there is no complete theoretical system, and there is no symmetry. Instead, we have a “main” equation (5), which used to be the “structural” equation (1). We also have a “first-stage” equation, which is the regression of railway stations on the instrument. The now rarely considered regression of the variable of interest on the instrument, here of poverty on earthquakes or on river confluences, is nowadays referred to as the reduced form, although it was originally one equation of a multiple equation reduced form—equation (6) is also part of the reduced form—within which it had no special significance. These language shifts sometimes cause confusion, but they are not the most important differences between the two systems.
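The "first stage, then regress on the fitted value" mechanics can be sketched on simulated city data. Everything below is a hypothetical illustration, not from the paper: the data-generating numbers, the binary instrument Z (a stand-in for an "infrastructure development area" designation), and the instrument's validity are all built in by construction, which is precisely what cannot be guaranteed for real candidates such as earthquakes or river confluences.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Assumed design: Z is a binary designation, independent of the error v.
Z = rng.binomial(1, 0.5, n).astype(float)
eta = rng.normal(0.0, 1.0, n)
v = rng.normal(0.0, 1.0, n)

# Stations are more likely in designated areas, but also depend on v
# (prosperous cities get stations), which is what biases OLS.
R = (0.5 * Z + 0.5 * v + eta > 0.7).astype(float)

theta = -0.10                        # assumed true effect on the headcount ratio
P = 0.40 + theta * R + 0.05 * v      # poverty headcount ratio

# First stage: regress R on Z and form fitted values.
X1 = np.column_stack([np.ones(n), Z])
R_hat = X1 @ np.linalg.lstsq(X1, R, rcond=None)[0]

# Second stage: regress P on fitted R; slope is the 2SLS estimate.
X2 = np.column_stack([np.ones(n), R_hat])
theta_2sls = np.linalg.lstsq(X2, P, rcond=None)[0][1]

# Naive OLS of P on R picks up the correlation between R and v.
Xo = np.column_stack([np.ones(n), R])
theta_ols = np.linalg.lstsq(Xo, P, rcond=None)[0][1]

print(f"OLS:  {theta_ols:.3f}")
print(f"2SLS: {theta_2sls:.3f}")
```

Because the simulation forces Z to be independent of v, the 2SLS slope recovers the assumed θ while OLS is pulled toward zero; in real applications that independence is an untestable assumption, which is the point the text goes on to develop.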
