Abstract

Extreme value statistics provides accurate estimates for the small occurrence probabilities of rare events. While theory and statistical tools for univariate extremes are well developed, methods for high-dimensional and complex data sets are still scarce. Appropriate notions of sparsity and connections to other fields such as machine learning, graphical models, and high-dimensional statistics have only recently been established. This article reviews the new domain of research concerned with the detection and modeling of sparse patterns in rare events. We first describe the different forms of extremal dependence that can arise between the largest observations of a multivariate random vector. We then discuss the current research topics, including clustering, principal component analysis, and graphical modeling for extremes. Identification of groups of variables that can be concomitantly extreme is also addressed. The methods are illustrated with an application to flood risk assessment.

/content/journals/10.1146/annurev-statistics-040620-041554
2021-03-07
2024-04-24
  • Article Type: Review Article