How to See More in Observational Studies: Some New Quasi-Experimental Devices

Paul R. Rosenbaum

doi:10.1146/annurev-statistics-010814-020201

Annual Review of Statistics and Its Application

Volume 2, 2015

Review Article



Affiliation: Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104
Vol. 2:21–48 (Volume publication date May 2015) https://doi.org/10.1146/annurev-statistics-010814-020201
First published as a Review in Advance on November 06, 2014
© Annual Reviews

Abstract

In a well-conducted, slightly idealized, randomized experiment, the only explanation of an association between treatment and outcome is an effect caused by the treatment. However, this is not true in observational studies of treatment effects, in which treatment and outcomes may be associated because of some bias in the assignment of treatments to individuals. When added to the design of an observational study, quasi-experimental devices investigate empirically a particular rival explanation or counterclaim, often attempting to preempt anticipated counterclaims. This review has three parts: a discussion of the often misunderstood logic of quasi-experimental devices; a brief overview of the important work of Donald T. Campbell and his colleagues (excellent expositions of this work have been published elsewhere); and its main topic, descriptions and empirical examples of newer devices, including evidence factors, differential effects, and the computerized construction of quasi-experiments.

Keywords: differential effects, evidence factors, multiple control groups, sensitivity analysis, strengthening an instrumental variable
