Abstract

The Human Genome Project was an enormous accomplishment, providing a foundation for countless explorations into the genetics and genomics of the human species. Yet for many years, the human genome reference sequence remained incomplete and lacked representation of human genetic diversity. Recently, two major advances have emerged to address these shortcomings: complete gap-free human genome sequences, such as the one developed by the Telomere-to-Telomere Consortium, and high-quality pangenomes, such as the one developed by the Human Pangenome Reference Consortium. Facilitated by advances in long-read DNA sequencing and genome assembly algorithms, complete human genome sequences resolve regions that have been historically difficult to sequence, including centromeres, telomeres, and segmental duplications. In parallel, pangenomes capture the extensive genetic diversity across populations worldwide. Together, these advances usher in a new era of genomics research, enhancing the accuracy of genomic analysis, paving the path for precision medicine, and contributing to deeper insights into human biology.

/content/journals/10.1146/annurev-genom-021623-081639
2024-08-27
2024-12-13

  • Article Type: Review Article