We present an expository, general analysis of valid post-selection or post-regularization inference about a low-dimensional target parameter in the presence of a very high-dimensional nuisance parameter that is estimated using selection or regularization methods. Our analysis provides a set of high-level conditions under which inference for the low-dimensional parameter based on testing or point estimation methods will be regular despite selection or regularization biases occurring in the estimation of the high-dimensional nuisance parameter. A key element is the use of so-called immunized or orthogonal estimating equations that are locally insensitive to small mistakes in the estimation of the high-dimensional nuisance parameter. As an illustration, we analyze affine-quadratic models and specialize these results to a linear instrumental variables model with many regressors and many instruments. We conclude with a review of other developments in post-selection inference and note that many can be viewed as special cases of the general encompassing framework of orthogonal estimating equations provided in this article.
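To make the notion of an orthogonal estimating equation concrete, the following is a minimal sketch and is not taken from the article. It assumes a toy partially linear model, y = alpha*d + beta*x + eps with d = gamma*x + v, in which a single control x stands in for the high-dimensional nuisance part. All names (`simulate`, `naive_estimate`, `orthogonal_estimate`, `DELTA`) are hypothetical, and the nuisance "estimates" are simply the true values shifted by a small error `DELTA` to mimic regularization bias. The naive moment multiplies the structural residual by d; the orthogonal moment multiplies it by the residualized treatment d - gamma*x.

```python
import random


def simulate(n, alpha=1.0, beta=1.0, gamma=1.0, seed=0):
    """Draw (y, d, x) from the assumed toy model:
    y = alpha*d + beta*x + eps,  d = gamma*x + v,
    with x, v, eps independent standard normal."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        d = gamma * x + v
        y = alpha * d + beta * x + eps
        data.append((y, d, x))
    return data


def naive_estimate(data, beta_hat):
    """Solve sum[(y - d*a - x*beta_hat) * d] = 0 for a.
    This moment is valid at the true beta but is not orthogonal:
    its derivative with respect to beta is -E[d*x], which is nonzero."""
    num = sum((y - beta_hat * x) * d for y, d, x in data)
    den = sum(d * d for y, d, x in data)
    return num / den


def orthogonal_estimate(data, beta_hat, gamma_hat):
    """Solve sum[(y - d*a - x*beta_hat) * (d - x*gamma_hat)] = 0 for a.
    At the true nuisance values the derivatives of this moment with
    respect to beta and gamma both vanish."""
    num = sum((y - beta_hat * x) * (d - gamma_hat * x) for y, d, x in data)
    den = sum(d * (d - gamma_hat * x) for y, d, x in data)
    return num / den


ALPHA, BETA, GAMMA = 1.0, 1.0, 1.0
DELTA = 0.1  # assumed size of the mistake in each nuisance estimate

data = simulate(100_000, ALPHA, BETA, GAMMA)
naive_err = naive_estimate(data, BETA + DELTA) - ALPHA
orth_err = orthogonal_estimate(data, BETA + DELTA, GAMMA + DELTA) - ALPHA

print("naive error:     ", naive_err)
print("orthogonal error:", orth_err)
```

In this toy setup the population error of the naive estimator is proportional to `DELTA`, whereas the error of the orthogonal estimator is proportional to `DELTA` squared, which is the sense in which the orthogonal equation is locally insensitive to small nuisance mistakes.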





  • Article Type: Review Article