Large factor models use a few latent factors to characterize the co-movement of economic variables in a high-dimensional data set. High dimensionality brings challenges as well as new insights into the advancement of econometric theory. Because of their ability to effectively summarize information in large data sets, factor models have been increasingly used in economics and finance. The factors, estimated from the high-dimensional data, can, for example, help improve forecasting, provide efficient instruments, control for nonlinear unobserved heterogeneity, and capture cross-sectional dependence. This article reviews the theory on estimation and statistical inference of large factor models. It also discusses important applications and highlights future directions.




  • Article Type: Review Article