Nonprobability Sampling and Causal Analysis

Ulrich Kohler; Frauke Kreuter; Elizabeth A. Stuart

doi:10.1146/annurev-statistics-030718-104951

Annual Review of Statistics and Its Application

Volume 6, 2019

Review Article

Ulrich Kohler¹, Frauke Kreuter²,³,⁴, and Elizabeth A. Stuart⁵

Affiliations:
¹Faculty of Economics and Social Sciences, University of Potsdam, 14482 Potsdam, Germany
²Joint Program in Survey Methodology, University of Maryland, College Park, Maryland 20742, USA
³School of Social Sciences, University of Mannheim, 68131 Mannheim, Germany
⁴Statistical Methods Research Department, Institute for Employment Research (IAB), 90478 Nuremberg, Germany
⁵Department of Mental Health, Department of Biostatistics, and Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21205, USA
Vol. 6:149-172 (Volume publication date March 2019) https://doi.org/10.1146/annurev-statistics-030718-104951
First published as a Review in Advance on September 12, 2018
Copyright © 2019 by Annual Reviews. All rights reserved

Abstract

The long-standing approach of using probability samples in social science research has come under pressure through eroding survey response rates, advanced methodology, and easier access to large amounts of data. These factors, along with an increased awareness of the pitfalls of the nonequivalent comparison group design for the estimation of causal effects, have moved the attention of applied researchers away from issues of sampling and toward issues of identification. This article discusses the usability of samples with unknown selection probabilities for various research questions. In doing so, we review assumptions necessary for descriptive and causal inference and discuss research strategies developed to overcome sampling limitations.

Keywords: big data, causal inference, generalizability, heterogeneous treatment effects, measurement error, nonprobability sampling, self-selection, validity
