Abstract

This work revisits several proposals for the ordering of multivariate data via a prescribed depth function. We argue that one of these deserves special consideration, namely, Tukey's halfspace depth, which constructs nested convex sets via intersections of halfspaces. These sets provide a natural generalization of univariate order statistics to higher dimensions and exhibit consistency and asymptotic normality as estimators of corresponding population quantities. For absolutely continuous probability measures in Euclidean space, we present a connection between halfspace depth and the Radon transform of the density function, which is employed to formalize both the finite-sample and asymptotic probability distributions of the random nested sets. We review multivariate goodness-of-fit statistics based on halfspace depths, which were originally proposed in the projection pursuit literature. Finally, we demonstrate the utility of halfspace ordering as an exploratory tool by studying spatial data on maximum and minimum temperatures produced by a climate simulation model.
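
As a rough illustration of the central idea, the sample halfspace depth of a point is the smallest fraction of observations contained in any closed halfspace whose boundary passes through that point. The sketch below approximates this quantity for planar data by scanning a finite grid of directions, so it can only overestimate the exact sample depth. The function name `halfspace_depth` and the parameter `n_dirs` are illustrative assumptions and do not come from the article.

```python
import math


def halfspace_depth(x, data, n_dirs=360):
    """Approximate the sample halfspace (Tukey) depth of x in the plane.

    Scans n_dirs evenly spaced unit directions u and, for each, counts the
    observations y in the closed halfplane {y : u . (y - x) >= 0}.
    The minimum count, divided by the sample size, is returned.
    Because only finitely many directions are checked, the result is an
    upper bound on the exact sample depth.
    """
    px, py = x
    best = len(data)
    for k in range(n_dirs):
        angle = 2.0 * math.pi * k / n_dirs
        ux, uy = math.cos(angle), math.sin(angle)
        # Small tolerance so points on the boundary line count as inside.
        count = sum(
            1 for (qx, qy) in data
            if ux * (qx - px) + uy * (qy - py) >= -1e-12
        )
        best = min(best, count)
    return best / len(data)


# Toy data (hypothetical): four corners of a square plus its center.
data = [(1, 1), (1, -1), (-1, 1), (-1, -1), (0, 0)]
print(halfspace_depth((0, 0), data))  # 0.6: the center is the deepest point
print(halfspace_depth((5, 5), data))  # 0.0: a point outside the convex hull
```

Points with depth at least a given level form a convex region, and these regions shrink as the level increases, which is the nesting the abstract refers to. An exact planar computation would examine only the directions determined by the data points rather than a fixed grid.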

/content/journals/10.1146/annurev-statistics-062614-042835
2015-04-10
2024-04-18
  • Article Type: Review Article