
Abstract

Robust statistics is a fairly mature field that dates back to the early 1960s, with many foundational concepts having been developed in the ensuing decades. However, the field has drawn a new surge of attention in the past decade, largely due to a desire to recast robust statistical principles in the context of high-dimensional statistics. In this article, we begin by reviewing some of the central ideas in classical robust statistics. We then discuss the need for new theory in high dimensions, using recent work in high-dimensional M-estimation as an illustrative example. Next, we highlight a variety of recent topics that have drawn a flurry of research activity from both statisticians and theoretical computer scientists, demonstrating the need for further research in robust estimation that embraces new estimation and contamination settings, as well as a greater emphasis on computational tractability in high dimensions.

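To make the phrase "high-dimensional M-estimation" concrete, the sketch below fits a sparse Huber-loss regression with an L1 penalty by proximal gradient descent. This is only an illustrative example of the general estimator class, not the specific methods reviewed in the article; the function names, tuning constants, step size, and toy data are all assumptions made for the illustration.

```python
# Illustrative sketch (not the article's method): sparse Huber regression,
#   min_beta (1/n) * sum_i huber(y_i - x_i' beta) + lam * ||beta||_1,
# solved by proximal gradient descent with soft-thresholding.
import numpy as np

def huber_grad(r, delta=1.345):
    """Derivative of the Huber loss with respect to the residual r."""
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_huber_regression(X, y, lam=0.1, delta=1.345, step=None, n_iter=500):
    n, p = X.shape
    if step is None:
        # 1/L, where L = ||X||_2^2 / n upper-bounds the Lipschitz constant
        # of the gradient of the smooth Huber term.
        step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        grad = -X.T @ huber_grad(r, delta) / n   # gradient of the smooth part
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Toy usage: sparse signal, heavy-tailed noise, and a few gross outliers.
rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 2.0
y = X @ beta_true + rng.standard_t(df=2, size=n)
y[:10] += 50.0                                   # contaminated responses
beta_hat = sparse_huber_regression(X, y, lam=0.2)
print("nonzeros recovered:", np.flatnonzero(np.abs(beta_hat) > 0.5))
```

The bounded derivative of the Huber loss limits the influence of the contaminated responses, while the L1 penalty handles the p > n regime; this combination is the prototypical setting for the high-dimensional robust M-estimation theory the abstract alludes to.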
