
Abstract

Social scientists are now in an era of data abundance, and machine learning tools are increasingly used to extract meaning from data sets both massive and small. We explain how the inclusion of machine learning in the social sciences requires us to rethink not only applications of machine learning methods but also best practices in the social sciences. In contrast to the traditional tasks for machine learning in computer science and statistics, when machine learning is applied to social scientific data, it is used to discover new concepts, measure the prevalence of those concepts, assess causal effects, and make predictions. The abundance of data and resources facilitates the move away from a deductive social science to a more sequential, interactive, and ultimately inductive approach to inference. We explain how an agnostic approach to machine learning methods focused on the social science tasks facilitates progress across a wide range of questions.

/content/journals/10.1146/annurev-polisci-053119-015921
2021-05-11
2024-03-29

  • Article Type: Review Article