Abstract

The integration of multiomics data with detailed phenotypic insights from electronic health records marks a paradigm shift in biomedical research, offering unparalleled holistic views into health and disease pathways. This review delineates the current landscape of multimodal omics data integration, emphasizing its transformative potential in generating a comprehensive understanding of complex biological systems. We explore robust methodologies for data integration, ranging from concatenation-based to transformation-based and network-based strategies, designed to harness the intricate nuances of diverse data types. Our discussion extends from incorporating large-scale population biobanks to dissecting high-dimensional omics layers at the single-cell level. The review underscores the emerging role of large language models in artificial intelligence, anticipating their influence as a near-future pivot in data integration approaches. Highlighting both achievements and hurdles, we advocate for a concerted effort toward sophisticated integration models, fortifying the foundation for groundbreaking discoveries in precision medicine.

/content/journals/10.1146/annurev-biodatasci-102523-103801
2024-08-23
2025-02-10
  • Article Type: Review Article