
Abstract

Mutations are the driving force of evolution, yet they underlie many diseases, in particular cancer. They are thought to arise from a combination of stochastic errors in DNA processing, naturally occurring DNA damage (e.g., the spontaneous deamination of methylated CpG sites), replication errors, and dysregulation of DNA repair mechanisms. High-throughput sequencing has made it possible to generate large datasets to study mutational processes in health and disease. Since the first studies of mutational processes emerged in 2012, the field has gained increasing attention and has already accumulated a host of computational approaches and biomedical applications.

/content/journals/10.1146/annurev-biodatasci-122320-120920
2021-07-20
2024-12-06


  • Article Type: Review Article