On the Frequentist Properties of Bayesian Nonparametric Methods

Judith Rousseau

doi:10.1146/annurev-statistics-041715-033523

Annual Review of Statistics and Its Application

Volume 3, 2016

Review Article

Free


Judith Rousseau^1,2

Affiliations: ¹CEREMADE, Université Paris Dauphine, Paris 75016, France; email: [email protected] ²Laboratoire de Statistique, CREST-ENSAE, Malakoff 92245, France
Vol. 3:211–231 (Volume publication date June 2016) https://doi.org/10.1146/annurev-statistics-041715-033523
© Annual Reviews

Abstract

In this paper, I review the main results on the asymptotic properties of the posterior distribution in nonparametric or high-dimensional models. In particular, I explain how posterior concentration rates can be derived and what we learn from such analysis in terms of the impact of the prior distribution on high-dimensional models. These results concern fully Bayes and empirical Bayes procedures. I also describe some of the results that have been obtained recently in semiparametric models, focusing mainly on the Bernstein–von Mises property. Although these results are theoretical in nature, they shed light on some subtle behaviors of the prior models and sharpen our understanding of the family of functionals that can be well estimated for a given prior model.

Keywords: asymptotics, Bayesian nonparametrics, Bernstein–von Mises, empirical Bayes, posterior concentration







