Abstract

Bayesian inference is a powerful tool for combining information in complex settings, a task of increasing importance in modern applications. However, Bayesian inference with a flawed model can produce unreliable conclusions. This review discusses approaches to performing Bayesian inference when the model is misspecified, where, by misspecified, we mean that the analyst is unwilling to act as if the model is correct. Much has been written about this topic, and in most cases we do not believe that a conventional Bayesian analysis is meaningful when there is serious model misspecification. Nevertheless, in some cases it is possible to use a well-specified model to give meaning to a Bayesian analysis of a misspecified model, and we focus on such cases. Three main classes of methods are discussed: restricted likelihood methods, which use a model based on an insufficient summary of the original data; modular inference methods, which use a model constructed from coupled submodels, with some of the submodels correctly specified; and the use of a reference model to construct a projected posterior or predictive distribution for a simplified model considered to be useful for prediction or interpretation.

/content/journals/10.1146/annurev-statistics-040522-015915
2024-04-22
2024-06-16

  • Article Type: Review Article