Machine Learning for Social Science: An Agnostic Approach

Justin Grimmer; Margaret E. Roberts; Brandon M. Stewart

doi:10.1146/annurev-polisci-053119-015921

Annual Review of Political Science

Volume 24, 2021

Review Article

Open Access

Justin Grimmer¹, Margaret E. Roberts², and Brandon M. Stewart³

Affiliations:
¹Department of Political Science and Hoover Institution, Stanford University, Stanford, California 94305, USA; email: jgrimmer@stanford.edu
²Department of Political Science and Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, California 92093, USA; email: meroberts@ucsd.edu
³Department of Sociology and Office of Population Research, Princeton University, Princeton, New Jersey 08540, USA; email: bms4@princeton.edu
Vol. 24:395-419 (Volume publication date May 2021) https://doi.org/10.1146/annurev-polisci-053119-015921
First published as a Review in Advance on March 05, 2021
Copyright © 2021 by Annual Reviews.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See credit lines of images or other third-party material in this article for license information.

Abstract

Social scientists are now in an era of data abundance, and machine learning tools are increasingly used to extract meaning from data sets both massive and small. We explain how the inclusion of machine learning in the social sciences requires us to rethink not only applications of machine learning methods but also best practices in the social sciences. In contrast to the traditional tasks for machine learning in computer science and statistics, when machine learning is applied to social scientific data, it is used to discover new concepts, measure the prevalence of those concepts, assess causal effects, and make predictions. The abundance of data and resources facilitates the move away from a deductive social science to a more sequential, interactive, and ultimately inductive approach to inference. We explain how an agnostic approach to machine learning methods focused on the social science tasks facilitates progress across a wide range of questions.

Keywords: machine learning, research design, text as data
