
Abstract

This article provides an overview of the methods used for algorithmic text analysis in economics, with a focus on three key contributions. First, we introduce methods for representing documents as high-dimensional count vectors over vocabulary terms, for representing words as vectors, and for representing word sequences as embedding vectors. Second, we define four core empirical tasks that encompass most text-as-data research in economics and enumerate the various approaches that have been taken so far to accomplish these tasks. Finally, we flag limitations in the current literature, with a focus on the challenge of validating algorithmic output.
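The first contribution above describes the bag-of-words representation: each document becomes a high-dimensional vector of term counts indexed by a shared vocabulary. As a minimal illustration (not taken from the article; the toy documents and function names are invented for this sketch), a count-vector representation can be built in a few lines of plain Python:

```python
# Toy sketch: documents as high-dimensional count vectors
# over vocabulary terms (bag-of-words representation).
from collections import Counter

def build_vocabulary(docs):
    """Sorted list of unique tokens across all documents."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})

def count_vector(doc, vocab):
    """Term-count vector for one document, indexed by vocab order."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocab]

docs = [
    "inflation expectations rose",
    "inflation fell as expectations anchored",
]
vocab = build_vocabulary(docs)           # 6 distinct terms
vectors = [count_vector(d, vocab) for d in docs]
```

In practice researchers would use a library tokenizer and sparse matrices, since real vocabularies run to tens of thousands of terms, but the underlying document-term structure is exactly this.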

DOI: 10.1146/annurev-economics-082222-074352
2023-09-13
2024-04-30

  • Article Type: Review Article