
Abstract

Learnability has traditionally been considered a crucial constraint on theoretical syntax; however, the issues involved have been poorly understood, partly because of the lack of simple learning algorithms for various types of formal grammars. Here I discuss the computational issues involved in learning hierarchically structured grammars from strings of symbols alone. The methods are based on an abstract notion of the derivational context of a syntactic category, which in the most elementary case of context-free grammars leads to learning algorithms based on a form of traditional distributional analysis. Crucially, these techniques can be extended to mildly context-sensitive grammars (and beyond), thus yielding learning methods that can in principle learn classes of grammars powerful enough to represent all natural languages. These learning methods require that the syntactic categories of the grammars be visible in a certain technical sense: They must be well characterized either by the sets of strings that they generate or by the sets of contexts in which they can appear. However, there are still significant gaps between these theoretical results and their direct implementation as models of language acquisition; I discuss how these remaining problems can be overcome.
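To make the notion of distributional analysis concrete, the following Python sketch groups the substrings of a toy corpus by the sets of contexts in which they occur. This is only a minimal illustration of the underlying idea, not an implementation of any published learning algorithm: the toy corpus, the function names, and the use of exact context-set identity as the grouping criterion are all simplifying assumptions.

```python
from collections import defaultdict

def contexts(corpus):
    """Map each non-empty substring to the set of contexts (l, r)
    in which it occurs, i.e., l + substring + r is a corpus sentence.
    Sentences are tuples of word tokens."""
    ctx = defaultdict(set)
    for sentence in corpus:
        n = len(sentence)
        for i in range(n):
            for j in range(i + 1, n + 1):
                ctx[sentence[i:j]].add((sentence[:i], sentence[j:]))
    return ctx

def congruence_classes(corpus):
    """Group substrings that share exactly the same context set:
    a crude approximation of the syntactic categories that a
    distributional learner would posit."""
    classes = defaultdict(list)
    for substring, c in contexts(corpus).items():
        classes[frozenset(c)].append(substring)
    return [subs for subs in classes.values() if len(subs) > 1]

corpus = [
    ("the", "dog", "barks"),
    ("the", "cat", "barks"),
    ("the", "dog", "sleeps"),
    ("the", "cat", "sleeps"),
]
for group in congruence_classes(corpus):
    print(group)
```

On this toy corpus the sketch groups ("dog",) with ("cat",) and ("barks",) with ("sleeps",), along with the corresponding noun-phrase-like and sentence-level units, which is exactly the kind of category structure a linguist would posit. A realistic learner must of course relax exact context-set identity to cope with sparse data, and further machinery is needed to go beyond the context-free case to the mildly context-sensitive grammars discussed in the abstract.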
