Abstract

Markov chain Monte Carlo (MCMC) is one of the most useful approaches to scientific computing because of its flexible construction, ease of use, and generality. Indeed, MCMC is indispensable for performing Bayesian analysis. Two critical questions that MCMC practitioners need to address are where to start and when to stop the simulation. Although a great deal of research has gone into establishing convergence criteria and stopping rules with sound theoretical foundations, in practice, MCMC users often decide convergence by applying empirical diagnostic tools. This review article discusses the most widely used MCMC convergence diagnostic tools. Some recently proposed stopping rules with firm theoretical footing are also presented. The convergence diagnostics and stopping rules are illustrated using three detailed examples.

/content/journals/10.1146/annurev-statistics-031219-041300
2020-03-07
2024-04-19

  • Article Type: Review Article