
Abstract

Causal mediation analysis provides an attractive framework for integrating diverse types of exposure, genomic, and phenotype data. Interest in the field has surged recently, driven largely by the growing need for causal mediation analyses in the health and social sciences. This article reviews recent developments in mediation analysis, covering analyses with a single mediator, with a large number of mediators, and with multiple exposures and mediators. Our review focuses on recent advances in statistical inference for causal mediation analysis, particularly in the high-dimensional setting. We delve into the complexities of testing mediation effects, with emphasis on the challenge of testing a large number of composite null hypotheses. Through extensive simulation studies, we compare existing methods across a range of scenarios. We also analyze data from the Normative Aging Study, examining DNA methylation CpG sites as potential mediators of the effect of smoking status on lung function. We discuss the pros and cons of these methods and directions for future research.

/content/journals/10.1146/annurev-statistics-040622-031653
2025-03-07
2025-04-22

  • Article Type: Review Article