Beginning in the early 2000s, the accumulation of biospecimens linked to electronic health records (EHRs) made possible genome-phenome studies (i.e., comparative analyses of genetic variants and phenotypes) using only data collected as a by-product of typical health care. In addition to disease and trait genetics, EHRs proved a valuable resource for analyzing pharmacogenetic traits and developing reverse genetics approaches such as phenome-wide association studies (PheWASs). PheWASs are designed to survey which of many phenotypes may be associated with a given genetic variant. PheWAS methods have been validated through replication of hundreds of known genotype-phenotype associations. Their use has differentiated between true pleiotropy and clinical comorbidity, added context to genetic discoveries, and helped define disease subtypes; they may also help repurpose medications. PheWAS methods have also proven useful with research-collected data. Future efforts that integrate broad, robust collection of phenotype data (e.g., EHR data) with purpose-collected research data, combined with a greater understanding of EHR data, will create a rich resource for increasingly efficient and detailed genome-phenome analysis and usher in new discoveries in precision medicine.






  • Article Type: Review Article