Annual Review of Genomics and Human Genetics - Volume 21, 2020
-
The Long Journey from Diagnosis to Therapy
Vol. 21 (2020), pp. 1–13
I was honored to be asked by the Editorial Committee of the Annual Review of Genomics and Human Genetics to write an autobiographical account of my life in science and in genetics in particular. The field has moved from mapping Mendelian disorders 40 years ago to the delivery of effective therapies for some monogenic disorders today. My 40-year journey from diagnosis to therapy for Duchenne muscular dystrophy has depended on collaborations among basic scientists, clinicians, medical charities, genetic counselors, biotech companies, and affected families. The future of human genetics looks even more exciting, with techniques such as single-cell sequencing and somatic cell CRISPR editing opening up opportunities for precision medicine and accelerating progress.
-
-
-
An Accidental Genetic Epidemiologist
Vol. 21 (2020), pp. 15–36
I briefly describe my early life and how, through a series of serendipitous events, I became a genetic epidemiologist. I discuss how the Elston–Stewart algorithm was discovered and its contribution to segregation, linkage, and association analysis. New linkage findings and paternity testing resulted from having a genotyping lab. The different meanings of interaction—statistical and biological—are clarified. The computer package S.A.G.E. (Statistical Analysis for Genetic Epidemiology), based on extensive method development over two decades, was conceived in 1986, flourished for 20 years, and is now freely available for use and further development. Finally, I describe methods to estimate and test hypotheses about familial correlations, and point out that the liability model often used to estimate disease heritability estimates the heritability of that liability, rather than of the disease itself, and so can be highly dependent on the assumed distribution of that liability.
-
-
-
Enhancer Predictions and Genome-Wide Regulatory Circuits
Vol. 21 (2020), pp. 37–54
Spatiotemporal control of gene expression during development requires orchestrated activities of numerous enhancers, which are cis-regulatory DNA sequences that, when bound by transcription factors, support selective activation or repression of associated genes. Proper activation of enhancers is critical during embryonic development, adult tissue homeostasis, and regeneration, and inappropriate enhancer activity is often associated with pathological conditions such as cancer. Multiple consortia [e.g., the Encyclopedia of DNA Elements (ENCODE) Consortium and National Institutes of Health Roadmap Epigenomics Mapping Consortium] and independent investigators have mapped putative regulatory regions in a large number of cell types and tissues, but the sequence determinants of cell-specific enhancers are not yet fully understood. Machine learning approaches trained on large sets of these regulatory regions can identify core transcription factor binding sites and generate quantitative predictions of enhancer activity and the impact of sequence variants on activity. Here, we review these computational methods in the context of enhancer prediction and gene regulatory network models specifying cell fate.
-
-
-
Progress, Challenges, and Surprises in Annotating the Human Genome
Vol. 21 (2020), pp. 55–79
Our understanding of the human genome has continuously expanded since its draft publication in 2001. Over the years, novel assays have allowed us to progressively overlay layers of knowledge above the raw sequence of A's, T's, G's, and C's. The reference human genome sequence is now a complex knowledge base maintained under the shared stewardship of multiple specialist communities. Its complexity stems from the fact that it is simultaneously a template for transcription, a record of evolution, a vehicle for genetics, and a functional molecule. In short, the human genome serves as a frame of reference at the intersection of a diversity of scientific fields. In recent years, the progressive fall in sequencing costs has given increasing importance to the quality of the human reference genome, as hundreds of thousands of individuals are being sequenced yearly, often for clinical applications. Also, novel sequencing-based assays shed light on novel functions of the genome, especially with respect to gene expression regulation. Keeping the human genome annotation up to date and accurate is therefore an ongoing partnership between reference annotation projects and the greater community worldwide.
-
-
-
RNA Conformation Capture by Proximity Ligation
Vol. 21 (2020), pp. 81–100
RNA proximity ligation is a set of molecular biology techniques used to analyze the conformations and spatial proximity of RNA molecules within cells. A typical experiment starts with cross-linking of a biological sample using UV light or psoralen, followed by partial fragmentation of RNA, RNA–RNA ligation, library preparation, and high-throughput sequencing. In the past decade, proximity ligation has been used to study structures of individual RNAs, networks of interactions between small RNAs and their targets, and whole RNA–RNA interactomes, in models ranging from bacteria to animal tissues and whole animals. Here, we provide an overview of the field, highlight the main findings, review the recent experimental and computational developments, and provide troubleshooting advice for new users. In the final section, we draw parallels between DNA and RNA proximity ligation and speculate on possible future research directions.
-
-
-
Cell Lineage Tracing and Cellular Diversity in Humans
Vol. 21 (2020), pp. 101–116
Tracing cell lineages is fundamental for understanding the rules governing development in multicellular organisms and delineating complex biological processes involving the differentiation of multiple cell types with distinct lineage hierarchies. In humans, experimental lineage tracing is unethical, and one has to rely on natural-mutation markers that are created within cells as they proliferate and age. Recent studies have demonstrated that it is now possible to trace lineages in normal, noncancerous cells with a variety of data types using natural variations in the nuclear and mitochondrial DNA as well as variations in DNA methylation status. It is also apparent that the scientific community is on the verge of being able to make a comprehensive and detailed cell lineage map of human embryonic and fetal development. In this review, we discuss the advantages and disadvantages of different approaches and markers for lineage tracing. We also describe the general conceptual design for how to derive a lineage map for humans.
-
-
-
Cultivating DNA Sequencing Technology After the Human Genome Project
Vol. 21 (2020), pp. 117–138
When the Human Genome Project was completed in 2003, automated Sanger DNA sequencing with fluorescent dye labels was the dominant technology. Several nascent alternative methods based on older ideas that had not been fully developed were the focus of technical researchers and companies. Funding agencies recognized the dynamic nature of technology development and that, beyond the Human Genome Project, there were growing opportunities to deploy DNA sequencing in biological research. Consequently, the National Human Genome Research Institute of the National Institutes of Health created a program—widely known as the Advanced Sequencing Technology Program—that stimulated all stages of development of new DNA sequencing methods, from innovation to advanced manufacturing and production testing, with the goal of reducing the cost of sequencing a human genome first to $100,000 and then to $1,000. The events of this period provide a powerful example of how judicious funding of academic and commercial partners can rapidly advance core technology developments that lead to profound advances across the scientific landscape.
-
-
-
Pangenome Graphs
Vol. 21 (2020), pp. 139–162
Low-cost whole-genome assembly has enabled the collection of haplotype-resolved pangenomes for numerous organisms. In turn, this technological change is encouraging the development of methods that can precisely address the sequence and variation described in large collections of related genomes. These approaches often use graphical models of the pangenome to support algorithms for sequence alignment, visualization, functional genomics, and association studies. The additional information provided to these methods by the pangenome allows them to achieve superior performance on a variety of bioinformatic tasks, including read alignment, variant calling, and genotyping. Pangenome graphs stand to become a ubiquitous tool in genomics. Although it is unclear whether they will replace linear reference genomes, their ability to harmoniously relate multiple sequence and coordinate systems will make them useful irrespective of which pangenomic models become most common in the future.
-
-
-
Using Single-Cell and Spatial Transcriptomes to Understand Stem Cell Lineage Specification During Early Embryo Development
Vol. 21 (2020), pp. 163–181
Embryonic development and stem cell differentiation provide a paradigm to understand the molecular regulation of coordinated cell fate determination and the architecture of tissue patterning. Emerging technologies such as single-cell RNA sequencing and spatial transcriptomics are opening new avenues to dissect cell organization, the divergence of morphological and molecular properties, and lineage allocation. Rapid advances in experimental and computational tools have enabled researchers to make many discoveries and revisit old hypotheses. In this review, we describe the use of single-cell RNA sequencing in studies of molecular trajectories and gene regulation networks for stem cell lineages, while highlighting the integrated experimental and computational analysis of single-cell and spatial transcriptomes in the molecular annotation of tissue lineages and development during postimplantation gastrulation.
-
-
-
The Genomics and Genetics of Oxygen Homeostasis
Vol. 21 (2020), pp. 183–204
Human survival is dependent upon the continuous delivery of O2 to each cell in the body in sufficient amounts to meet metabolic requirements, primarily for ATP generation by oxidative phosphorylation. Hypoxia-inducible factors (HIFs) regulate the transcription of thousands of genes to balance O2 supply and demand. The HIFs are negatively regulated by O2-dependent hydroxylation and ubiquitination by prolyl hydroxylase domain (PHD) proteins and the von Hippel–Lindau (VHL) protein. Germline mutations in the genes encoding VHL, HIF-2α, and PHD2 cause hereditary erythrocytosis, which is characterized by polycythemia and pulmonary hypertension and is caused by increased HIF activity. Evolutionary adaptation to life at high altitude is associated with unique genetic variants in the genes encoding HIF-2α and PHD2 that blunt the erythropoietic and pulmonary vascular responses to hypoxia.
-
-
-
The Genetics of Epilepsy
Vol. 21 (2020), pp. 205–230
Epilepsy encompasses a group of heterogeneous brain diseases that affect more than 50 million people worldwide. Epilepsy may have discernible structural, infectious, metabolic, and immune etiologies; however, in most people with epilepsy, no obvious cause is identifiable. Based initially on family studies and later on advances in gene sequencing technologies and computational approaches, as well as the establishment of large collaborative initiatives, we now know that genetics plays a much greater role in epilepsy than was previously appreciated. Here, we review the progress in the field of epilepsy genetics and highlight molecular discoveries in the most important epilepsy groups, including those that have been long considered to have a nongenetic cause. We discuss where the field of epilepsy genetics is moving as it enters a new era in which the genetic architecture of common epilepsies is starting to be unraveled.
-
-
-
Twenty-Five Years of Spinal Muscular Atrophy Research: From Phenotype to Genotype to Therapy, and What Comes Next
Vol. 21 (2020), pp. 231–261
Twenty-five years ago, the underlying genetic cause for one of the most common and devastating inherited diseases in humans, spinal muscular atrophy (SMA), was identified. Homozygous deletions or, rarely, subtle mutations of SMN1 cause SMA, and the copy number of the nearly identical copy gene SMN2 inversely correlates with disease severity. SMA has become a paradigm and a prime example of a monogenic neurological disorder that can be efficiently ameliorated or nearly cured by novel therapeutic strategies, such as antisense oligonucleotide or gene replacement therapy. These therapies enable infants to survive who might otherwise have died before the age of two and allow individuals who have never been able to sit or walk to do both. The major milestones on the road to these therapies were to understand the genetic cause and splice regulation of SMN genes, the disease's phenotype–genotype variability, the function of the protein and the main affected cellular pathways and tissues, the disease's pathophysiology through research on animal models, the windows of opportunity for efficient treatment, and how and when to treat patients most effectively. This review aims to bridge our knowledge from phenotype to genotype to therapy, not only highlighting the significant advances so far but also speculating about the future of SMA screening and treatment.
-
-
-
The Laminopathies and the Insights They Provide into the Structural and Functional Organization of the Nucleus
Vol. 21 (2020), pp. 263–288
In recent years, our perspective on the cell nucleus has evolved from the view that it is a passive but permeable storage organelle housing the cell's genetic material to an understanding that it is in fact a highly organized, integrative, and dynamic regulatory hub. In particular, the subcompartment at the nuclear periphery, comprising the nuclear envelope and the underlying lamina, is now known to be a critical nexus in the regulation of chromatin organization, transcriptional output, biochemical and mechanosignaling pathways, and, more recently, cytoskeletal organization. We review the various functional roles of the nuclear periphery and their deregulation in diseases of the nuclear envelope, specifically the laminopathies, which, despite their rarity, provide insights into contemporary health-care issues.
-
-
-
Recent Advances in Understanding the Genetic Architecture of Autism
Vol. 21 (2020), pp. 289–304
Recent advances in understanding the genetic architecture of autism spectrum disorder have allowed for unprecedented insight into its biological underpinnings. New studies have elucidated the contributions of a variety of forms of genetic variation to autism susceptibility. While the roles of de novo copy number variants and single-nucleotide variants—causing loss-of-function or missense changes—have been increasingly recognized and refined, mosaic single-nucleotide variants have been implicated more recently in some cases. Moreover, inherited variants (including common variants) and, more recently, rare recessive inherited variants have come into greater focus. Finally, noncoding variants—both inherited and de novo—have been implicated in the last few years. This work has revealed a convergence of diverse genetic drivers on common biological pathways and has highlighted the ongoing importance of increasing sample size and experimental innovation. Continuing to synthesize these genetic findings with functional and phenotypic evidence and translating these discoveries to clinical care remain considerable challenges for the field.
-
-
-
Genomic Data Sharing for Novel Mendelian Disease Gene Discovery: The Matchmaker Exchange
Vol. 21 (2020), pp. 305–326
In the last decade, exome and/or genome sequencing has become a common test in the diagnosis of individuals with features of a rare Mendelian disorder. Despite its success, this test leaves the majority of tested individuals undiagnosed. This review describes the Matchmaker Exchange (MME), a federated network established to facilitate the solving of undiagnosed rare-disease cases through data sharing. MME supports genomic matchmaking, the act of connecting two or more parties looking for cases with similar phenotypes and variants in the same candidate genes. An application programming interface currently connects six matchmaker nodes—the Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources (DECIPHER), GeneMatcher, PhenomeCentral, seqr, MyGene2, and the Initiative on Rare and Undiagnosed Diseases (IRUD) Exchange—resulting in a collective data set spanning more than 150,000 cases from more than 11,000 contributors in 88 countries. Here, we describe the successes and challenges of MME, its individual matchmaking nodes, plans for growing the network, and considerations for future directions.
-
-
-
Genomically Aided Diagnosis of Severe Developmental Disorders
Vol. 21 (2020), pp. 327–349
Our ability to make accurate and specific genetic diagnoses in individuals with severe developmental disorders has been transformed by data derived from genomic sequencing technologies. These data reveal both the patterns and rates of different mutational mechanisms and identify regions of the human genome with fewer mutations than would be expected. In outbred populations, the most common identifiable cause of severe developmental disorders is de novo mutation affecting the coding region in one of approximately 500 different genes, almost universally showing constraint. Simply combining the location of a de novo genomic event with its predicted consequence on the gene product gives significant diagnostic power. Our knowledge of the diversity of phenotypic consequences associated with comparable diagnostic genotypes at each locus is improving. Computationally useful phenotype data will improve diagnostic interpretation of ultrarare genetic variants and, in the long run, indicate which specific embryonic processes have been perturbed.
-
-
-
New Diagnostic Approaches for Undiagnosed Rare Genetic Diseases
Vol. 21 (2020), pp. 351–372
Accurate diagnosis is the cornerstone of medicine; it is essential for informed care and promoting patient and family well-being. However, families with a rare genetic disease (RGD) often spend more than five years on a diagnostic odyssey of specialist visits and invasive testing that is lengthy, costly, and often futile, as 50% of patients do not receive a molecular diagnosis. The current diagnostic paradigm is not well designed for RGDs, especially for patients who remain undiagnosed after the initial set of investigations, and thus requires an expansion of approaches in the clinic. Leveraging opportunities to participate in research programs that utilize new technologies to understand RGDs is an important path forward for patients seeking a diagnosis. Given recent advancements in such technologies and international initiatives, the prospect of identifying a molecular diagnosis for all patients with RGDs has never been so attainable, but achieving this goal will require global cooperation at an unprecedented scale.
-
-
-
Population Screening for Inherited Predisposition to Breast and Ovarian Cancer
Vol. 21 (2020), pp. 373–412
The discovery of genes underlying inherited predisposition to breast and ovarian cancer has revolutionized the ability to identify women at high risk for these diseases before they become affected. Women who are carriers of deleterious variants in these genes can undertake surveillance and prevention measures that have been shown to reduce morbidity and mortality. However, under current strategies, the vast majority of women carriers remain undetected until they become affected. In this review, we show that universal testing, particularly of the BRCA1 and BRCA2 genes, fulfills classical disease screening criteria. This is especially true for BRCA1 and BRCA2 in Ashkenazi Jews but is translatable to all populations and may include additional genes. Utilizing genetic information for large-scale precision prevention requires a paradigmatic shift in health-care delivery. To address this need, we propose a direct-to-patient model, which is increasingly pertinent for fulfilling the promise of utilizing personal genomic information for disease prevention.
-
-
-
Genetic Influences on Disease Subtypes
Andy Dahl and Noah Zaitlen
Vol. 21 (2020), pp. 413–435
Disease classification, or nosology, was historically driven by careful examination of clinical features of patients. As technologies to measure and understand human phenotypes advanced, so too did classifications of disease, and the advent of genetic data has led to a surge in genetic subtyping in the past decades. Although the fundamental process of refining disease definitions and subtypes is shared across diverse fields, each field is driven by its own goals and technological expertise, leading to inconsistent and conflicting definitions of disease subtypes. Here, we review several classical and recent subtypes and subtyping approaches and provide concrete definitions to delineate subtypes. In particular, we focus on subtypes with distinct causal disease biology, which are of primary interest to scientists, and subtypes with pragmatic medical benefits, which are of primary interest to physicians. We propose genetic heterogeneity as a gold standard for establishing biologically distinct subtypes of complex polygenic disease. We focus especially on methods to find and validate genetic subtypes, emphasizing common pitfalls and how to avoid them.
-
-
-
How Natural Genetic Variation Shapes Behavior
Vol. 21 (2020), pp. 437–463
Nervous systems allow animals to acutely respond and behaviorally adapt to changes and recurring patterns in their environment at multiple timescales—from milliseconds to years. Behavior is further shaped at intergenerational timescales by genetic variation, drift, and selection. This sophistication and flexibility of behavior makes it challenging to measure behavior consistently in individual subjects and to compare it across individuals. In spite of these challenges, careful behavioral observations in nature and controlled measurements in the laboratory, combined with modern technologies and powerful genetic approaches, have led to important discoveries about the way genetic variation shapes behavior. A critical mass of genes whose variation is known to modulate behavior in nature is finally accumulating, allowing us to recognize emerging patterns. In this review, we first discuss genetic mapping approaches useful for studying behavior. We then survey how variation acts at different levels—in environmental sensation, in internal neuronal circuits, and outside the nervous system altogether—and then discuss the sources and types of molecular variation linked to behavior and the mechanisms that shape such variation. We end by discussing remaining questions in the field.