The p-value quantifies the discrepancy between the data and a null hypothesis of interest, usually the assumption of no difference or no effect. A Bayesian approach allows the calibration of p-values by transforming them to direct measures of the evidence against the null hypothesis, so-called Bayes factors. We review the available literature in this area and consider two-sided significance tests for a point null hypothesis in more detail. We distinguish simple from local alternative hypotheses and contrast traditional Bayes factors based on the data with Bayes factors based on p-values or test statistics. A well-known finding is that the minimum Bayes factor, the smallest possible Bayes factor within a certain class of alternative hypotheses, provides less evidence against the null hypothesis than the corresponding p-value might suggest. It is less well known that the relationship between p-values and minimum Bayes factors also depends on the sample size and on the dimension of the parameter of interest. We illustrate the transformation of p-values to minimum Bayes factors with two examples from clinical research.
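As a rough illustration of what transforming a p-value to a minimum Bayes factor can look like, the following Python sketch implements two standard calibrations from the literature: the bound exp(-z²/2) for simple alternatives in a two-sided normal test, and the -e p log p bound, which holds for p < 1/e. The function names are hypothetical, chosen only for this example, and the sketch does not include the adjustments for sample size or parameter dimension discussed in the article.

```python
import math
from statistics import NormalDist


def min_bf_simple(p: float) -> float:
    """Minimum Bayes factor over simple alternatives (two-sided normal test).

    Converts the two-sided p-value to |z| and returns exp(-z^2 / 2).
    Assumes the test statistic is (approximately) standard normal under the null.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # Lower-tail quantile of p/2 equals -|z|; the sign vanishes after squaring.
    z = NormalDist().inv_cdf(p / 2.0)
    return math.exp(-0.5 * z * z)


def min_bf_e_p_log_p(p: float) -> float:
    """The -e * p * log(p) calibration, valid for p < 1/e.

    For p >= 1/e the bound is taken to be 1 (no evidence against the null).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p >= 1.0 / math.e:
        return 1.0
    return -math.e * p * math.log(p)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(p, round(min_bf_simple(p), 4), round(min_bf_e_p_log_p(p), 4))
```

For p = 0.05 the two bounds are roughly 0.15 and 0.41, both considerably larger than 0.05, which is the sense in which a minimum Bayes factor provides less evidence against the null hypothesis than the p-value might suggest.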





  • Article Type: Review Article