Annual Review of Statistics and Its Application: Most Cited Articles
http://www.annualreviews.org/content/journals/statistics?TRACK=RSS
Please follow the links to view the content.

Functional Data Analysis
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-041715-033624?TRACK=RSS
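As a concrete sketch of functional principal component analysis, the dimension-reduction tool this review highlights, the following toy example (simulated, densely observed curves; all names and data are illustrative, not from the article) estimates the mean and covariance functions on a grid and extracts FPC scores by eigendecomposition:

```python
import numpy as np

# Minimal FPCA sketch on densely observed curves (simulated data).
# Each row of X is one curve sampled on a common grid of t_points.
rng = np.random.default_rng(0)
n_curves, t_points = 200, 50
t = np.linspace(0, 1, t_points)
# Two known modes of variation plus measurement noise.
scores1 = rng.normal(0, 2, n_curves)
scores2 = rng.normal(0, 1, n_curves)
X = (np.outer(scores1, np.sin(2 * np.pi * t))
     + np.outer(scores2, np.cos(2 * np.pi * t))
     + rng.normal(0, 0.1, (n_curves, t_points)))

mean_fn = X.mean(axis=0)                  # estimated mean function
Xc = X - mean_fn
cov = Xc.T @ Xc / n_curves                # discretized covariance function
eigvals, eigvecs = np.linalg.eigh(cov)    # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Fraction of variance explained by the first two components,
# and the FPC scores (the low-dimensional representation of each curve).
fve = eigvals[:2].sum() / eigvals.sum()
pc_scores = Xc @ eigvecs[:, :2]
print(f"FVE by 2 components: {fve:.3f}")
```

With two genuine modes of variation, the first two eigenvalues dominate and the scores recover the simulated coefficients up to sign; sparse-data FPCA requires smoothing steps not shown here.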
With the advance of modern technology, more and more data are being recorded continuously during a time interval or intermittently at several discrete time points. These are both examples of functional data, which has become a commonly encountered type of data. Functional data analysis (FDA) encompasses the statistical methodology for such data. Broadly interpreted, FDA deals with the analysis and theory of data that are in the form of functions. This paper provides an overview of FDA, starting with simple statistical notions such as mean and covariance functions, then covering some core techniques, the most popular of which is functional principal component analysis (FPCA). FPCA is an important dimension reduction tool, and in sparse data situations it can be used to impute functional data that are sparsely observed. Other dimension reduction approaches are also discussed. In addition, we review another core technique, functional linear regression, as well as clustering and classification of functional data. Beyond linear and single- or multiple-index methods, we touch upon a few nonlinear approaches that are promising for certain applications. They include additive and other nonlinear functional regression models and models that feature time warping, manifold learning, and empirical differential equations. The paper concludes with a brief discussion of future directions.

Jane-Ling Wang, Jeng-Min Chiou and Hans-Georg Müller (2023-12-09T15:41:25Z)

Probabilistic Forecasting
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-062713-085831?TRACK=RSS
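The PIT check described below can be sketched directly: under a calibrated forecast, the PIT values F(Y) are approximately uniform on (0, 1). A minimal simulated comparison of a calibrated versus an overdispersed Gaussian forecast (illustrative only, not from the article):

```python
import math
import random

def norm_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

random.seed(1)
n = 20000
pit_sharp, pit_wide = [], []
for _ in range(n):
    mu = random.gauss(0, 1)        # varying forecast location
    y = random.gauss(mu, 1)        # observation drawn from N(mu, 1)
    pit_sharp.append(norm_cdf(y, mu, 1.0))  # calibrated forecast
    pit_wide.append(norm_cdf(y, mu, 3.0))   # overdispersed forecast

# Uniform PIT values have mean 1/2 and variance 1/12; overdispersion
# concentrates PIT values near 1/2, shrinking the variance (the PIT
# histogram would show a central hump).
var_sharp = sum((z - 0.5) ** 2 for z in pit_sharp) / n
var_wide = sum((z - 0.5) ** 2 for z in pit_wide) / n
print(var_sharp, var_wide)
```

In practice one inspects the PIT histogram rather than a single variance, but the variance already separates the calibrated forecast (near 1/12) from the overdispersed one.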
A probabilistic forecast takes the form of a predictive probability distribution over future quantities or events of interest. Probabilistic forecasting aims to maximize the sharpness of the predictive distributions, subject to calibration, on the basis of the available information set. We formalize and study notions of calibration in a prediction space setting. In practice, probabilistic calibration can be checked by examining probability integral transform (PIT) histograms. Proper scoring rules such as the logarithmic score and the continuous ranked probability score serve to assess calibration and sharpness simultaneously. As a special case, consistent scoring functions provide decision-theoretically coherent tools for evaluating point forecasts. We emphasize methodological links to parametric and nonparametric distributional regression techniques, which attempt to model and to estimate conditional distribution functions; we use the context of statistically postprocessed ensemble forecasts in numerical weather prediction as an example. Throughout, we illustrate concepts and methodologies in data examples.

Tilmann Gneiting and Matthias Katzfuss (2023-12-09T15:40:20Z)

Bayesian Computing with INLA: A Review
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-060116-054045?TRACK=RSS
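The Laplace method described below can be sketched in a few lines: approximate the log-integrand by a second-order Taylor expansion at its mode and integrate the resulting Gaussian analytically. A minimal toy illustration (not the nested INLA machinery) applies it to the Gamma integral defining n!:

```python
import math

def laplace_integral(g, dg, d2g, x0, iters=50):
    """Laplace approximation to the integral of exp(g(x)) dx.

    Finds the mode by Newton's method starting from x0, then uses the
    second-order Taylor expansion of g at the mode, which turns the
    integrand into a Gaussian with analytically known integral.
    """
    x = x0
    for _ in range(iters):
        x = x - dg(x) / d2g(x)   # Newton step toward the mode
    return math.exp(g(x)) * math.sqrt(2.0 * math.pi / -d2g(x))

# Example: n! = integral of x^n e^{-x} over (0, inf), so g(x) = n log x - x.
n = 10
approx = laplace_integral(
    g=lambda x: n * math.log(x) - x,
    dg=lambda x: n / x - 1.0,
    d2g=lambda x: -n / x ** 2,
    x0=5.0,
)
exact = math.factorial(n)
print(approx / exact)  # about 0.992: accurate already for this integral
```

The relative error shrinks as the integrand becomes more peaked, which is the regime INLA exploits for latent Gaussian models.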
The key operation in Bayesian inference is to compute high-dimensional integrals. An old approximate technique is the Laplace method or approximation, which dates back to Pierre-Simon Laplace (1774). This simple idea approximates the integrand with a second-order Taylor expansion around the mode and computes the integral analytically. By developing a nested version of this classical idea, combined with modern numerical techniques for sparse matrices, we obtain the approach of integrated nested Laplace approximations (INLA) to do approximate Bayesian inference for latent Gaussian models (LGMs). LGMs represent an important model abstraction for Bayesian inference and include a large proportion of the statistical models used today. In this review, we discuss the reasons for the success of the INLA approach, the R-INLA package, why it is so accurate, why the approximations are very quick to compute, and why LGMs make such a useful concept for Bayesian computing.

Håvard Rue, Andrea Riebler, Sigrunn H. Sørbye, Janine B. Illian, Daniel P. Simpson and Finn K. Lindgren (2023-12-09T15:40:10Z)

Functional Regression
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-010814-020413?TRACK=RSS
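The scalar-on-function model below can be reduced to ordinary least squares by expanding the coefficient function in a truncated basis, one simple form of the regularization the review discusses. A minimal sketch on simulated data (Fourier sine basis; all choices are illustrative):

```python
import numpy as np

# Scalar-on-function regression: y_i = integral of X_i(t) * beta(t) dt + noise.
# Expanding beta(t) in K basis functions turns this into OLS on K
# integrated scores per curve (regularization by basis truncation).
rng = np.random.default_rng(0)
n, m = 300, 100
t = np.linspace(0, 1, m, endpoint=False) + 0.5 / m  # midpoint grid
h = 1.0 / m                                         # grid spacing

basis6 = np.array([np.sin((k + 1) * np.pi * t) for k in range(6)])
X = rng.normal(size=(n, 6)) @ basis6                # simulated curves
beta_true = np.sin(np.pi * t)                       # true coefficient function
y = (X * beta_true).sum(axis=1) * h + rng.normal(0, 0.05, n)

K = 4
B = basis6[:K]                  # truncated basis for beta(t)
Z = X @ B.T * h                 # n x K matrix of integrated basis scores
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat = coef @ B             # estimated coefficient function

err = np.sqrt(((beta_hat - beta_true) ** 2).sum() * h)
print(f"L2 error of estimated beta(t): {err:.3f}")
```

Here the true coefficient function lies in the basis, so truncation alone suffices; in practice a roughness penalty on the basis coefficients is the more common regularizer.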
Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay & Silverman's (1997) textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article focuses on functional regression, the area of FDA that has received the most attention in applications and methodological development. First, there is an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: (a) functional predictor regression (scalar-on-function), (b) functional response regression (function-on-scalar), and (c) function-on-function regression. For each, the role of replication and regularization is discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. The review concludes with a brief discussion describing potential areas of future development in this field.

Jeffrey S. Morris (2023-12-09T15:41:09Z)

Algorithmic Fairness: Choices, Assumptions, and Definitions
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-042720-125902?TRACK=RSS
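Two of the definitions cataloged below can be computed directly from decisions, outcomes, and group membership. A toy sketch of demographic parity and equal opportunity (hypothetical data and group labels; definitions only, with no endorsement of any one criterion):

```python
# Each record is a (decision d, true outcome y, group g) triple.
data = [
    (1, 1, "a"), (1, 0, "a"), (0, 0, "a"), (1, 1, "a"), (0, 1, "a"),
    (1, 1, "b"), (0, 0, "b"), (0, 1, "b"), (0, 0, "b"), (1, 0, "b"),
]

def positive_rate(rows):
    """Fraction of positive decisions among the given (d, y) pairs."""
    rows = list(rows)
    return sum(d for d, _ in rows) / len(rows)

groups = ("a", "b")
# Demographic parity asks that P(D=1 | G=g) be equal across groups.
dp = {g: positive_rate((d, y) for d, y, gg in data if gg == g)
      for g in groups}
# Equal opportunity asks that P(D=1 | Y=1, G=g) be equal across groups
# (parity of true positive rates).
tpr = {g: positive_rate((d, y) for d, y, gg in data if gg == g and y == 1)
       for g in groups}

print("selection rates:", dp)
print("true positive rates:", tpr)
```

On this toy data the two criteria disagree in magnitude, illustrating the article's point that the definitions encode different assumptions and generally cannot all hold at once.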
A recent wave of research has attempted to define fairness quantitatively. In particular, this work has explored what fairness might mean in the context of decisions based on the predictions of statistical and machine learning models. The rapid growth of this new field has led to wildly inconsistent motivations, terminology, and notation, presenting a serious challenge for cataloging and comparing definitions. This article attempts to bring much-needed order. First, we explicate the various choices and assumptions made—often implicitly—to justify the use of prediction-based decision-making. Next, we show how such choices and assumptions can raise fairness concerns and we present a notationally consistent catalog of fairness definitions from the literature. In doing so, we offer a concise reference for thinking through the choices, assumptions, and fairness considerations of prediction-based decision-making.

Shira Mitchell, Eric Potash, Solon Barocas, Alexander D'Amour and Kristian Lum (2023-12-09T15:40:21Z)

Topological Data Analysis
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-031017-100045?TRACK=RSS
Topological data analysis (TDA) can broadly be described as a collection of data analysis methods that find structure in data. These methods include clustering, manifold estimation, nonlinear dimension reduction, mode estimation, ridge estimation and persistent homology. This paper reviews some of these methods.

Larry Wasserman (2023-12-09T15:41:22Z)

Finite Mixture Models
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-031017-100325?TRACK=RSS
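The two-component Gaussian mixture, fit by the EM algorithm, is the canonical example behind much of the literature reviewed below. A minimal sketch on simulated data (unit variances held fixed for brevity; everything here is illustrative):

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Simulate a well-separated two-component Gaussian mixture.
random.seed(42)
data = ([random.gauss(0.0, 1.0) for _ in range(300)]
        + [random.gauss(5.0, 1.0) for _ in range(300)])

# EM with unit variances fixed, estimating the weights and means.
w, mu = [0.5, 0.5], [1.0, 4.0]
for _ in range(100):
    # E-step: posterior responsibility of each component for each point.
    resp = []
    for x in data:
        p = [w[k] * normal_pdf(x, mu[k], 1.0) for k in (0, 1)]
        s = p[0] + p[1]
        resp.append([p[0] / s, p[1] / s])
    # M-step: reweighted means and mixing proportions.
    for k in (0, 1):
        nk = sum(r[k] for r in resp)
        w[k] = nk / len(data)
        mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk

print(w, mu)  # weights near 0.5, means near 0 and 5
```

The responsibilities also give a model-based clustering of the points, the use emphasized by McLachlan & Basford (1988).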
The important role of finite mixture models in the statistical analysis of data is underscored by the ever-increasing rate at which articles on mixture applications appear in the statistical and general scientific literature. The aim of this article is to provide an up-to-date account of the theory and methodological developments underlying the applications of finite mixture models. Because of their flexibility, mixture models are being increasingly exploited as a convenient, semiparametric way in which to model unknown distributional shapes. This is in addition to their obvious applications where there is group structure in the data or where the aim is to explore the data for such structure, as in a cluster analysis. It has now been three decades since the publication of the monograph by McLachlan & Basford (1988) with an emphasis on the potential usefulness of mixture models for inference and clustering. Since then, mixture models have attracted the interest of many researchers and have found many new and interesting fields of application. Thus, the literature on mixture models has expanded enormously, and as a consequence, the bibliography here can only provide selected coverage.

Geoffrey J. McLachlan, Sharon X. Lee and Suren I. Rathnayake (2023-12-09T15:38:34Z)

Learning Deep Generative Models
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-010814-020120?TRACK=RSS
Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many artificial intelligence–related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that involve many layers of nonlinear processing. In this article, we review several popular deep learning models, including deep belief networks and deep Boltzmann machines. We show that (a) these deep generative models, which contain many layers of latent variables and millions of parameters, can be learned efficiently, and (b) the learned high-level feature representations can be successfully applied in many application domains, including visual object recognition, information retrieval, classification, and regression tasks.

Ruslan Salakhutdinov (2023-12-09T15:39:02Z)

Microbiome, Metagenomics, and High-Dimensional Compositional Data Analysis
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-010814-020351?TRACK=RSS
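High-dimensional compositional data of the kind discussed below are often handled through log-ratio transformations. A minimal sketch of the centered log-ratio (clr) transform (toy counts chosen purely for illustration):

```python
import math

def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

# Raw sequencing counts for one sample; closure to relative abundances.
counts = [120, 30, 45, 5]
total = sum(counts)
rel = [c / total for c in counts]
z = clr(rel)
print([round(v, 3) for v in z])
# clr coordinates always sum to zero: the transformed data live on a
# hyperplane, which is why standard multivariate methods need adaptation.
```

Note that clr is invariant to the total: transforming the raw counts gives the same coordinates as transforming the relative abundances, reflecting that only relative information is identifiable.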
The human microbiome is the totality of all microbes in and on the human body, and its importance in health and disease has been increasingly recognized. High-throughput sequencing technologies have recently enabled scientists to obtain an unbiased quantification of all microbes constituting the microbiome. Often, a single sample can produce hundreds of millions of short sequencing reads. However, unique characteristics of the data produced by the new technologies, as well as the sheer magnitude of these data, make drawing valid biological inferences from microbiome studies difficult. Analysis of these big data poses great statistical and computational challenges. Important issues include normalization and quantification of relative taxa, bacterial genes, and metabolic abundances; incorporation of phylogenetic information into analysis of metagenomics data; and multivariate analysis of high-dimensional compositional data. We review existing methods, point out their limitations, and outline future research directions.

Hongzhe Li (2023-12-09T15:37:30Z)

On p-Values and Bayes Factors
http://www.annualreviews.org/content/journals/10.1146/annurev-statistics-031017-100307?TRACK=RSS
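For one standard class of alternatives, the minimum Bayes factor discussed below has a well-known closed form: the −e·p·log(p) bound of Sellke, Bayarri & Berger (2001), a lower bound on the Bayes factor of the null versus the alternative, valid for p < 1/e. A small illustrative sketch:

```python
import math

def min_bayes_factor(p):
    """Sellke-Bayarri-Berger lower bound -e * p * log(p) on the Bayes
    factor of the null versus the alternative, valid for 0 < p < 1/e."""
    if not 0 < p < 1 / math.e:
        raise ValueError("calibration applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)

for p in (0.05, 0.01, 0.005):
    print(f"p = {p}: minimum Bayes factor >= {min_bayes_factor(p):.3f}")
```

For p = 0.05 the bound is about 0.41, i.e. at best roughly 2.5-to-1 evidence against the null, illustrating the abstract's point that minimum Bayes factors convey less evidence against the null than the p-value might suggest.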
The p-value quantifies the discrepancy between the data and a null hypothesis of interest, usually the assumption of no difference or no effect. A Bayesian approach allows the calibration of p-values by transforming them to direct measures of the evidence against the null hypothesis, so-called Bayes factors. We review the available literature in this area and consider two-sided significance tests for a point null hypothesis in more detail. We distinguish simple from local alternative hypotheses and contrast traditional Bayes factors based on the data with Bayes factors based on p-values or test statistics. A well-known finding is that the minimum Bayes factor, the smallest possible Bayes factor within a certain class of alternative hypotheses, provides less evidence against the null hypothesis than the corresponding p-value might suggest. It is less known that the relationship between p-values and minimum Bayes factors also depends on the sample size and on the dimension of the parameter of interest. We illustrate the transformation of p-values to minimum Bayes factors with two examples from clinical research.

Leonhard Held and Manuela Ott (2023-12-09T15:38:30Z)