Abstract

Much has been written on the abuse and misuse of statistical methods, including p-values, statistical significance, and so forth. I present some of the best practices in statistics using a running example data analysis. Focusing primarily on frequentist and Bayesian linear mixed models, I illustrate some defensible ways in which statistical inference (specifically, hypothesis testing using Bayes factors versus estimation or uncertainty quantification) can be carried out. The key is to not overstate the evidence and to not expect too much from statistics. Along the way, I demonstrate some powerful ideas, including the use of simulation to understand the design properties of one's experiment before running it, visualization of data before carrying out a formal analysis, and simulation of data from the fitted model to understand the model's behavior.
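
As a rough illustration of what simulating the design properties of an experiment before running it can look like, the sketch below estimates statistical power by repeatedly generating fake data and counting how often a test comes out significant. It is not the analysis from the article: the function name `simulate_power`, the effect size, the standard deviation, and the sample sizes are all hypothetical, and the data are assumed to be independent per-subject difference scores rather than the crossed subject-and-item structure a linear mixed model would handle. The test uses a normal approximation to the t distribution, which is only reasonable for moderately large samples.

```python
import math
import random
from statistics import NormalDist


def simulate_power(effect, sd, n_subjects, n_sims=2000, alpha=0.05, seed=1):
    """Estimate power by simulation for a test on per-subject differences.

    Illustrative assumptions: each subject contributes one difference score
    drawn independently from Normal(effect, sd), and significance is judged
    with a two-sided normal-approximation test at level alpha.
    """
    rng = random.Random(seed)
    # Two-sided critical value under the normal approximation.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = 0
    for _ in range(n_sims):
        # Generate one fake experiment.
        diffs = [rng.gauss(effect, sd) for _ in range(n_subjects)]
        mean = sum(diffs) / n_subjects
        var = sum((d - mean) ** 2 for d in diffs) / (n_subjects - 1)
        se = math.sqrt(var / n_subjects)
        # Count the experiment as a "detection" if the test is significant.
        if abs(mean / se) > crit:
            significant += 1
    return significant / n_sims


if __name__ == "__main__":
    # Hypothetical scenario: a 20 ms effect with a 100 ms standard deviation.
    for n in (20, 40, 80, 160):
        print(n, simulate_power(effect=20.0, sd=100.0, n_subjects=n))
```

Running the script prints the estimated power for each hypothetical sample size, which gives a sense of how many subjects a design of this kind would need before any data are collected. A more faithful version would simulate from the mixed model one intends to fit.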

/content/journals/10.1146/annurev-linguistics-031220-010345
2023-01-17
2024-10-05
  • Article Type: Review Article