Abstract

This article reviews the literature on sparse high-dimensional models and discusses some applications in economics and finance. Recent developments in theory, methods, and implementations in penalized least-squares and penalized likelihood methods are highlighted. These variable selection methods are effective in sparse high-dimensional modeling. The limits of dimensionality that regularization methods can handle, the role of penalty functions, and their statistical properties are detailed. Some recent advances in sparse ultra-high-dimensional modeling are also briefly discussed.
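
As a rough illustration of how a penalty function produces variable selection in penalized least squares, the sketch below minimizes (1/(2n))||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent. It assumes the L1 (lasso) penalty as one representative choice. The function names, the toy design matrix, and the value of lam are hypothetical and are not taken from the article, which covers a broader family of penalties.

```python
def soft_threshold(z, lam):
    """Return the minimizer over b of 0.5 * (z - b)**2 + lam * abs(b)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    """
    n = len(y)
    p = len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    # Mean squared value of each column, used to scale the update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes column j's own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_b
    return beta


# Toy data (hypothetical): three mutually orthogonal predictors, of which
# only the first has a sizeable effect on the response.
X = [[1.0, 1.0, 1.0],
     [1.0, -1.0, -1.0],
     [-1.0, 1.0, -1.0],
     [-1.0, -1.0, 1.0]]
y = [2.05, 1.95, -1.95, -2.05]  # y = 2 * x1 + 0.05 * x2
beta_hat = lasso_coordinate_descent(X, y, lam=0.1)
print(beta_hat)
```

On this toy design the first coefficient is shrunk from 2 to about 1.9, while the second and third are set exactly to zero. That illustrates the two effects of the L1 penalty: small effects are removed from the model, and retained effects are biased toward zero.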

/content/journals/10.1146/annurev-economics-061109-080451
2011-09-04
2025-06-19
