Abstract

Dependent survival data arise in many contexts. One is clustered survival data, where survival times are collected on clusters such as families or medical centers. Dependent survival data also arise when multiple survival times are recorded for each individual. Frailty models are a common approach to handling such data. In a frailty model, the dependence is expressed through a random effect, called the frailty. Frailty models have been used with both the Cox proportional hazards model and the accelerated failure time model. This article reviews recent developments in frailty models across a variety of settings. For each setting, we provide a detailed model description, assumptions, available estimation methods, and R packages.
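To make the role of the frailty concrete, the following is a minimal simulation sketch, not a method taken from the article. It assumes a shared gamma frailty with mean 1 and variance theta, an exponential baseline hazard, no censoring, and a hypothetical binary covariate; all function and parameter names are illustrative. Under a shared gamma frailty, Kendall's tau between two members of the same cluster is theta / (theta + 2), which the sketch checks empirically.

```python
import math
import random


def simulate_shared_frailty(n_clusters, cluster_size, theta, base_rate=1.0,
                            beta=0.0, seed=0):
    """Simulate clustered event times from a shared gamma frailty model.

    Conditional on the cluster frailty w_i, member j has constant hazard
    w_i * base_rate * exp(beta * z_ij), i.e. a proportional hazards model
    with an exponential baseline (an assumption made for simplicity).
    Returns a list of clusters, each a list of (time, z) tuples.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        # Gamma frailty with mean 1 and variance theta (shape 1/theta, scale theta).
        w = max(rng.gammavariate(1.0 / theta, theta), 1e-12)  # guard against a zero rate
        members = []
        for _ in range(cluster_size):
            z = rng.choice([0, 1])  # hypothetical binary covariate
            rate = w * base_rate * math.exp(beta * z)
            members.append((rng.expovariate(rate), z))
        clusters.append(members)
    return clusters


def kendall_tau(x, y):
    """Kendall's tau for paired samples without ties (O(n^2) pair count)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += 1 if s > 0 else -1
    return score / (n * (n - 1) / 2)


def within_cluster_tau(theta, n_clusters=800, seed=1):
    """Estimate Kendall's tau between the two members of each simulated pair."""
    pairs = simulate_shared_frailty(n_clusters, 2, theta, seed=seed)
    first = [members[0][0] for members in pairs]
    second = [members[1][0] for members in pairs]
    return kendall_tau(first, second)


if __name__ == "__main__":
    # Compare the empirical tau with the theoretical value theta / (theta + 2).
    for theta in (0.01, 2.0):
        print(theta, round(within_cluster_tau(theta), 3), theta / (theta + 2))
```

A frailty variance near zero should give nearly independent times within a cluster, while a larger variance induces stronger within-cluster dependence. Real analyses would also need to handle censoring and would typically rely on dedicated software such as the R packages the article surveys.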

/content/journals/10.1146/annurev-statistics-032921-021310
2023-03-09
2024-06-22
  • Article Type: Review Article