Abstract

In today's global economy, most people all over the world need to speak a second language (L2) for study, work, or social purposes. Assessment of speaking, either in the classroom or as an external exam, is therefore an important task. However, because of its fleeting nature, the assessment of speaking proficiency is difficult. For valid assessment, a speaking test must measure speaking proficiency without construct-irrelevant variance, for instance, due to tasks, raters, and interlocutors. This article begins by bringing together insights from different disciplines to develop a multi-componential construct of speaking proficiency, which includes linguistic and strategic competencies. Because speaking usually takes place in conversation, the ability to take part in interaction, including rapid prediction, is described as part of the speaking construct. Next, the factors that need to be controlled when making a speaking assessment are discussed. Finally, challenges and ideas for future research are briefly described.

/content/journals/10.1146/annurev-linguistics-030521-052114
2023-01-17
2024-05-23

  • Article Type: Review Article