Abstract

The landscape of survival analysis is constantly being revolutionized to answer biomedical challenges, most recently the statistical challenge of censored covariates rather than outcomes. There are many promising strategies to tackle censored covariates, including weighting, imputation, maximum likelihood, and Bayesian methods. Still, this is a relatively fresh area of research, different from the areas of censored outcomes (i.e., survival analysis) or missing covariates. In this review, we discuss the unique statistical challenges encountered when handling censored covariates and provide an in-depth review of existing methods designed to address those challenges. We emphasize each method's relative strengths and weaknesses, providing recommendations to help investigators pinpoint the best approach to handling censored covariates in their data.

/content/journals/10.1146/annurev-statistics-040522-095944
2024-04-22
2024-06-15
  • Article Type: Review Article