ERIC ED605694: Methodological Approaches for Impact Evaluation in Educational Settings

Anglin, K. L., Krishnamachari, A., & Wong, V. (2020). Methodological Approaches for Impact Evaluation in Educational Settings. Oxford Bibliographies. https://doi.org/10.1093/obo/9780199756810-0244

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant #R305B140026 to the Rectors and Visitors of the University of Virginia. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.

Methodological Approaches for Impact Evaluation in Educational Settings

INTRODUCTION

Since the start of the War on Poverty in the 1960s, social scientists have developed and refined experimental and quasi-experimental methods for evaluating and understanding the ways in which public policies, programs, and interventions affect people’s lives. The overarching mission of many social scientists is to understand “what works” in education and social policy. These are causal questions about whether an intervention, practice, program, or policy affects some outcome of interest. Although causal questions are not the only relevant questions in program evaluation, they are assumed by many in the fields of public health, economics, social policy, and now education to be the scientific foundation for evidence-based decision making. Fortunately, over the last half-century, two methodological advances have improved the rigor of social science approaches for making causal inferences. The first was acknowledging the primacy of research designs over statistical adjustment procedures. Donald Campbell and colleagues showed how research designs could be used to address many plausible threats to validity. The second was the use of potential outcomes to specify exact causal quantities of interest. This allowed researchers to think systematically about research design assumptions and to develop diagnostic measures for assessing when these assumptions are met.

This article reviews important statistical methods for estimating the impact of interventions on outcomes in education settings, particularly programs that are implemented in field, rather than laboratory, settings. We begin by describing the causal inference challenge for evaluating program effects. Then four research designs are discussed that may be used for estimating program impacts. The article highlights what the Campbell tradition identifies as the strongest causal research designs: the randomized experiment and the regression-discontinuity design. These approaches have the advantage of transparent assumptions for yielding causal effects. The article then discusses weaker but more commonly used approaches for estimating effects, including the interrupted time series and the non-equivalent comparison group designs. For the interrupted time series design, difference-in-differences is discussed as a more generalized approach to time series methods; for non-equivalent comparison group designs, the article highlights propensity score matching as a method for creating statistically equivalent groups on the basis of observed covariates. For each research design, references are included that discuss the underlying theory and logic of the method, exemplars of the approach in field settings, and recent methodological extensions to the design.
The article concludes with a discussion of practical considerations for evaluating interventions in field settings, including the external validity of estimated effects from impact studies.

GENERAL OVERVIEWS

The fundamental problem of causal inference is that we cannot observe both what happens to a student when they receive an intervention and what would have occurred in an alternate reality in which the same student did not receive the intervention. For example, researchers can observe what happens to children in a preschool program but cannot observe what would have happened to the same children had they not entered preschool. To study the causal effect of a program or intervention, one needs a counterfactual, or something that is contrary to fact. Given that researchers never observe the counterfactual, we look for approximations (e.g., older siblings, neighborhood children, children in a nationally representative survey, or randomly assigned control children not exposed to the treatment). The Rubin Causal Model, introduced in Rubin 1974, formalizes this reasoning mathematically. It is based on the idea that every unit has a potential outcome under each possible “assignment” to a treatment or control condition. Using a potential outcomes framework, researchers are able to define a causal estimand of interest for a well-defined treatment and inference population, as well as the assumptions required for a research design to yield a valid effect. Campbell and Stanley 1963 demonstrates how these assumptions may be violated in field settings through its list of “validity threats.” Cook and Campbell 1979 and Shadish et al. 2002 extend this idea by introducing four types of validity threats: threats to internal, external, statistical conclusion, and construct validity. Angrist and Pischke 2009 provides an up-to-date overview of common methodological approaches from an econometric perspective and discusses estimation procedures for producing causal estimates. Angrist and Pischke 2015 offers a more approachable overview of the same material intended for an undergraduate audience. Imbens and Rubin 2015 and Morgan and Winship 2007 straddle the econometric and statistics literatures and offer additional insights about causal inference from a potential outcomes perspective and a causal graph theory perspective, respectively. For an overview of key experimental and quasi-experimental designs specific to the field of education, see Murnane and Willett 2011 and Stuart 2007.
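To make the counterfactual logic above concrete, the standard potential outcomes notation can be written out explicitly. The following is a minimal sketch using conventional symbols; the notation is assumed here rather than drawn from the article itself.

```latex
% Potential outcomes for unit i (conventional notation, assumed for illustration):
%   Y_i(1) = outcome if unit i receives the intervention
%   Y_i(0) = outcome if the same unit does not
%   D_i    = 1 if unit i is actually treated, 0 otherwise
\begin{align*}
  \tau_i       &= Y_i(1) - Y_i(0)                         && \text{individual causal effect (never fully observed)} \\
  \mathrm{ATE} &= \mathbb{E}\big[\,Y_i(1) - Y_i(0)\,\big] && \text{average treatment effect in the inference population} \\
  Y_i          &= D_i\,Y_i(1) + (1 - D_i)\,Y_i(0)         && \text{observed outcome: only one potential outcome is revealed}
\end{align*}
```

Because each unit reveals only one of its two potential outcomes, a research design must justify how the missing counterfactual is approximated; the designs reviewed below differ mainly in how credibly they do so.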
Angrist, J., and J.-S. Pischke. 2009. Mostly harmless econometrics: An empiricist’s companion. Princeton, NJ: Princeton Univ. Press. [ISBN: 9780691120348][class:book]
This book is a reference on methods of causal inference using a potential outcomes framework. It covers randomized experiments, statistical matching, instrumental variables, difference-in-differences, and regression discontinuity. The book describes each design and its assumptions formally through a series of proofs and informally through applied examples. Though written for a graduate student audience, it is a useful resource for any evaluator with training in probability and statistics.

Angrist, J., and J.-S. Pischke. 2015. Mastering ’metrics: The path from cause to effect. Princeton, NJ: Princeton Univ. Press. [ISBN: 9780691152837][class:book]
Angrist and Pischke 2015 provides a more approachable and conversational companion to Angrist and Pischke 2009. While both books describe the same methods of causal inference (randomized control trials, statistical matching, instrumental variables, regression discontinuity, and difference-in-differences designs), this book focuses more on conceptual understanding than on formal proofs—though brief proofs are provided. The book is written as an introduction to causal inference for undergraduate economics students.

Campbell, D. T., and J. C. Stanley. 1963. Experimental and quasi-experimental designs for research. Boston, MA: Houghton Mifflin. [class:book]
This seminal book outlines the major threats to internal validity (Did the intervention cause the observed effect?) and external validity (To what population, settings, treatments, and outcomes can this effect be generalized?) and provides an overview of how design features can address these threats. While the book discusses quasi-experimental designs, it is best suited for an overview of conceptual challenges related to causal inference rather than for guidance on statistical methods for estimating effects.

Cook, T. D., and D. T. Campbell. 1979. Quasi-experimentation: Design and analysis issues for field settings. Boston, MA: Houghton Mifflin. [ISBN: 9780395307908][class:book]
Similar to Campbell and Stanley 1963, the first chapters of this book introduce the challenge of causal inference and threats to validity. The book updates Campbell and Stanley 1963 by also addressing analytical approaches. Helpfully, the book concludes with a section outlining major obstacles to conducting randomized experiments and describing situations that are particularly conducive to experimental evaluation.

Imbens, G., and D. Rubin. 2015. Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge, UK: Cambridge Univ. Press. [ISBN: 9780521885881][class:book]
This textbook provides a rigorous introduction to the potential outcomes framework. Because the book relies on formal mathematical derivations, it is most appropriate for those with a solid understanding of probability and statistics. The book discusses randomized experiments (including instrumental variables for non-compliance) and matching methods but does not provide an overview of quasi-experimental designs. Applied examples from education, social science, and biomedical science are used to illustrate concepts.

Morgan, S., and C. Winship. 2007. Counterfactuals and causal inference: Methods and principles for social research. Cambridge, UK: Cambridge Univ. Press. [ISBN: 9780521856157][class:book]
This textbook discusses how to answer causal questions using observational data rather than data where researchers have the opportunity to manipulate the treatment assignment. The book discusses randomized experiments primarily as a starting point for further understanding of non-experimental research designs, but several concepts, including the potential outcomes framework, are explained in detail with the help of causal diagrams, structural models, and examples from the social sciences.

Murnane, R., and J. Willett. 2011. Methods matter: Improving causal inference in educational and social science research. New York: Oxford Univ. Press. [ISBN: 9780199753864][class:book]
This book is a broadly accessible reference on causal inference in education research.
It illustrates important concepts in the design and analysis of randomized experiments, quasi-experiments (including the difference-in-differences, regression discontinuity, and instrumental variables approaches), and observational studies. High-quality causal studies in the field of education are used to demonstrate and evaluate the decisions researchers make in the design and analysis of a study.

Rubin, D. B. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66.5: 688–701. [doi:10.1037/h0037350]
Provides the fundamental building blocks for modern program evaluation. Rubin conceptualizes the fundamental challenge of causal inference using a series of potential outcomes—individual outcomes in the presence of treatment and in the absence of treatment. This conceptualization allows for the formalization of both experimental and non-experimental design assumptions and is often referred to as the Rubin Causal Model.

Shadish, W. R., T. D. Cook, and D. T. Campbell. 2002. Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin. [ISBN: 9780395615560][class:book]
This book is a successor to Campbell and Stanley 1963 and Cook and Campbell 1979. It provides a comprehensive discussion of the design elements a researcher may include to improve internal validity and supplies the conceptual theory for research design choices. The latter part of the book proposes a theoretical framework for generalized causal inference.

Stuart, E. A. 2007. Estimating causal effects using school-level data sets. Educational Researcher 36.4: 187–198. [doi:10.3102/0013189X07303396][class:journalArticle]
Stuart provides a survey of evaluation approaches with school-level data, including randomized experiments, regression discontinuity, interrupted time series, and non-equivalent comparison group designs. The article provides an overview of the National Longitudinal School-Level State Assessment Score Database (NLSLSASD) and key considerations to keep in mind when using the NLSLSASD or other school-level datasets to answer causal questions.

RANDOMIZED CONTROL TRIALS

The most credible evaluations use random assignment to determine access to an intervention. The modern design of randomized experiments can be attributed to Fisher 1935. In a randomized experiment, researchers assign participants to a “treatment” or “control” group using a deliberately random procedure such as a coin toss. The treatment group participates in some program or intervention while the control group does not. Assuming a large enough sample, the random assignment procedure creates two or more groups that are equivalent on average for all baseline characteristics and potential outcomes. When this happens, the evaluator may estimate program impacts by comparing the average outcomes in the treatment and control groups and interpreting the difference in the two means as the average treatment effect (ATE) in the study population. The random assignment procedure helps to ensure that differences in outcomes between the treatment and control groups are due to the treatment or policy under investigation, and not some unobserved factors related to both treatment assignment and the outcome. Over the last twenty years, there have been increasing calls for experimental evaluations of treatments, programs, and policies in education settings.
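To make the difference-in-means logic above concrete, the sketch below simulates a hypothetical randomized experiment and estimates the ATE. The data, variable names, and effect size are illustrative assumptions, not material from the article or its references.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical study population of students (illustrative values only)
n = 1000
baseline_score = rng.normal(50, 10, size=n)   # pre-treatment covariate
true_effect = 5.0                             # assumed treatment effect for the simulation

# Randomly assign half of the sample to treatment ("coin toss" by permutation)
treated = rng.permutation(np.r_[np.ones(n // 2), np.zeros(n - n // 2)]).astype(bool)

# Simulated outcomes: control potential outcome plus the effect if treated
outcome = baseline_score + rng.normal(0, 5, size=n) + true_effect * treated

# Difference-in-means estimate of the average treatment effect (ATE)
ate_hat = outcome[treated].mean() - outcome[~treated].mean()

# Conventional standard error for a difference in two independent means
se = np.sqrt(outcome[treated].var(ddof=1) / treated.sum()
             + outcome[~treated].var(ddof=1) / (~treated).sum())

print(f"Estimated ATE: {ate_hat:.2f} (SE {se:.2f})")
```

Because assignment is random, the baseline covariate is balanced across the two groups in expectation, so the simple difference in means is an unbiased estimate of the program effect; in practice, evaluators often add covariate adjustment to improve precision.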
Cook 2002 and Mosteller and Boruch 2002 offer arguments for conducting randomized control trials in field settings and provide advice for addressing the political and moral considerations that may arise. Beyond political and moral concerns, randomized control trials can be challenging to implement in field settings. Problems include randomization failure, interference between units, attrition, treatment noncompliance, and missing data. For comprehensive guides on addressing (and preventing) these challenges, we recommend Gerber and Green 2012 and Duflo, et al. 2007. Barnard, et al. 2003 also offers additional insights into addressing treatment non-compliance and missing data. Finally, Angrist, et al. 1996 discusses the use of instrumental variables (IV) to answer causal research questions when there is treatment non-compliance, and Gennetian, et al. 2002 uses an IV approach to identify effects of intervening variables (or mediators) with the aim of improving experimental design and informing policy decisions.
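For intuition about the IV approach to non-compliance mentioned above, a standard formulation (the notation is assumed here, not drawn from the article) treats random assignment Z as an instrument for the treatment actually received D. Under the identifying assumptions discussed in Angrist, et al. 1996, the ratio of the intent-to-treat contrast to the compliance rate identifies the average effect for units induced into treatment by their assignment:

```latex
% Z_i: random assignment (instrument); D_i: treatment actually received; Y_i: outcome.
% The Wald / IV estimand scales the intent-to-treat contrast by the share of compliers:
\mathrm{LATE}
  = \frac{\mathbb{E}[\,Y_i \mid Z_i = 1\,] - \mathbb{E}[\,Y_i \mid Z_i = 0\,]}
         {\mathbb{E}[\,D_i \mid Z_i = 1\,] - \mathbb{E}[\,D_i \mid Z_i = 0\,]}
```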
Angrist, J. D., G. W. Imbens, and D. B. Rubin. 1996. Identification of causal effects using instrumental variables. Journal of the American Statistical Association 91.434: 444–455. [class:journalArticle]
This paper outlines the use of IV to estimate the treatment effect on treated individuals in the case of treatment non-compliance. The use of IV is formulated using the Rubin Causal Model, and the authors outline the identifying assumptions required to identify treatment-on-the-treated effects.

Barnard, J., C. E. Frangakis, J. L. Hill, and D. B. Rubin. 2003. Principal stratification approach to broken randomized experiments: A case study of school choice vouchers in New York City. Journal of the American Statistical Association 98.462: 299–323. [class:journalArticle]
This article discusses the benefits of implementing a randomized experiment and outlines potential complications in experiments that involve human subjects. These include missing background and outcome data and noncompliance with randomly assigned treatment. The article details and addresses these complications using a principal stratification framework.

Cook, T. D. 2002. Randomized experiments in educational policy research: A critical examination of the reasons the educational evaluation community has offered for not doing them. Educational Evaluation and Policy Analysis 24.3: 175–199. [doi:10.3102/01623737024003175][class:journalArticle]
Despite the widespread belief that experiments provide the best warrant for causal claims, experiments have only recently started making their way into schools and classrooms. In this article, Cook discusses five common critiques of experiments and provides concrete examples of how experiments may be designed to counteract these concerns.

Duflo, E., R. Glennerster, and M. Kremer. 2007. Using randomization in development economics research: A toolkit. Handbook of Development Economics 4: 3895–3962. [doi:10.1016/s1573-4471(07)04061-2][class:journalArticle]
Duflo and colleagues provide an in-depth “toolkit” for practitioners and researchers who are interested in implementing randomized field experiments. The paper explains why randomized experiments are considered the best design to answer causal research questions, examines the conditions under which random assignment yields such causal claims, and discusses implementation procedures for successful studies.

Fisher, R. A. 1935. The design of experiments. Oxford, UK: Oliver & Boyd. [class:book]
As the first introduction to null hypothesis testing, Fisher’s Design of Experiments is considered a foundational work in experimental design. The book discusses several types of experimental designs and shows how conclusions can be drawn from such designs by formulating and disproving null hypotheses.

Gennetian, L. A., J. M. Bos, and P. A. Morris. 2002. Using instrumental variables analysis to learn more from social policy experiments [https://www.mdrc.org/sites/default/files/full_599.pdf]. MDRC Working Papers on Research Methodology. New York, NY: MDRC. [class:report]
This report discusses the use of IV for examining causal claims. In their report, the authors explore the feasibility of applying IV strategies to data from experimental designs, review policy questions that can be answered, and examine necessary conditions for estimating mediating effects. Provides guidance on the use of IV to design more effective interventions and inform broader policy decisions.

Gerber, A. S., and D. P. Green. 2012. Field experiments: Design, analysis, and interpretation. New York: W. W. Norton. [ISBN: 9780393979954][class:book]
This is an introductory textbook on field experiments in the social sciences and covers major topics in the design, implementation, and analysis of experiments in field settings. Readers also learn how to handle common implementation challenges that arise in field experiments, including treatment non-compliance, violations of participant non-interference assumptions, and missing data. Overall, this is a great resource for new researchers to familiarize themselves with the “how-to” of experiments.

Mosteller, F., and R. F. Boruch. 2002. Evidence matters: Randomized trials in education research. Washington, DC: Brookings Institution Press. [ISBN: 9780815702054][class:book]
In this edited volume, authors discuss the necessity of experiments, theorize reasons for their relative absence in education compared to other fields, and offer advice on addressing the political and moral challenges of conducting randomized experiments in education. Cook 2002 and this volume together provide a comprehensive overview of the status of experiments in education and the reasons they are sparsely implemented in educational settings.

REGRESSION DISCONTINUITY DESIGN

In a regression-discontinuity design, participants are assigned to treatment and comparison groups on the basis of a cutoff score on a quantitative assignment variable (also called a “running variable”). Here, individuals who score above the cutoff are assigned to the treatment (or control), while individuals who score below the cutoff are assigned to the control (or treatment).
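To illustrate the assignment rule just described, the sketch below simulates a sharp cutoff and recovers the effect at the cutoff by comparing local linear fits on either side, one common way regression-discontinuity effects are estimated. The data, cutoff, and bandwidth are illustrative assumptions, not material from the article.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Hypothetical running variable (e.g., a screening score) and cutoff (illustrative only)
n, cutoff = 2000, 60.0
score = rng.uniform(20, 100, size=n)
treated = score >= cutoff                     # sharp assignment rule at the cutoff

# Simulated outcome: smooth in the score, with a jump of 4 points at the cutoff
outcome = 0.3 * score + 4.0 * treated + rng.normal(0, 3, size=n)

# Local linear regression on each side of the cutoff within a bandwidth h
h = 10.0
left = (~treated) & (score >= cutoff - h)
right = treated & (score <= cutoff + h)
b_left = np.polyfit(score[left] - cutoff, outcome[left], 1)    # [slope, intercept]
b_right = np.polyfit(score[right] - cutoff, outcome[right], 1)

# The RD estimate is the difference in predicted outcomes at the cutoff (the intercepts)
rd_effect = b_right[1] - b_left[1]
print(f"Estimated effect at the cutoff: {rd_effect:.2f}")
```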
