Abstract

Electronic health records (EHRs) are a rich source of data for researchers, but extracting meaningful information out of this highly complex data source is challenging. Phecodes represent one strategy for defining phenotypes for research using EHR data. They are a high-throughput phenotyping tool based on ICD (International Classification of Diseases) codes that can be used to rapidly define the case/control status of thousands of clinically meaningful diseases and conditions. Phecodes were originally developed to conduct phenome-wide association studies to scan for phenotypic associations with common genetic variants. Since then, phecodes have been used to support a wide range of EHR-based phenotyping methods, including the phenotype risk score. This review aims to comprehensively describe the development, validation, and applications of phecodes and suggest some future directions for phecodes and high-throughput phenotyping.
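
As a rough illustration of how an ICD-based phecode can define case/control status, the sketch below groups ICD codes under a phecode and labels each patient. It is a minimal sketch under stated assumptions, not the published method: the small `ICD_TO_PHECODE` table is hand-written for illustration and should be checked against a published phecode map, the `phecode_status` function is hypothetical, and the `min_dates=2` threshold for cases is an assumed convention rather than something stated in this abstract.

```python
from collections import defaultdict

# Illustrative, hand-written mapping (an assumption for this example only).
# A real analysis would load a published ICD-to-phecode map instead.
ICD_TO_PHECODE = {
    "250.00": "250.2",  # ICD-9-CM code grouped under a type 2 diabetes phecode
    "250.02": "250.2",
    "E11.9": "250.2",   # ICD-10-CM code grouped under the same phecode
    "250.01": "250.1",  # grouped under a type 1 diabetes phecode
    "E10.9": "250.1",
}


def phecode_status(records, phecode, min_dates=2):
    """Label each patient as case (True), control (False), or excluded (None).

    records: iterable of (patient_id, icd_code, date_string) tuples.
    min_dates: assumed threshold of distinct dates needed to count as a case.
    """
    dates_by_patient = defaultdict(set)
    patients = set()
    for patient_id, icd_code, date in records:
        patients.add(patient_id)
        if ICD_TO_PHECODE.get(icd_code) == phecode:
            dates_by_patient[patient_id].add(date)

    status = {}
    for patient_id in patients:
        n_dates = len(dates_by_patient.get(patient_id, ()))
        if n_dates >= min_dates:
            status[patient_id] = True    # enough evidence: case
        elif n_dates == 0:
            status[patient_id] = False   # no matching codes: control
        else:
            status[patient_id] = None    # ambiguous: left out of the analysis
    return status


records = [
    ("p1", "250.00", "2020-01-01"),
    ("p1", "E11.9", "2020-03-01"),
    ("p2", "250.00", "2020-01-01"),
    ("p3", "E10.9", "2020-02-02"),
]
print(phecode_status(records, "250.2"))
```

Because the same function can be called once per phecode, the labelling scales to many phenotypes without writing a separate algorithm for each, which is the sense in which the approach is high-throughput.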

/content/journals/10.1146/annurev-biodatasci-122320-112352
2021-07-20
2024-06-25

  • Article Type: Review Article