The complexity of the human exposome, the totality of environmental exposures encountered from birth to death, motivates systematic, high-throughput approaches to discover new environmental determinants of disease. In this review, we describe the state of the science in analyzing the human exposome and provide recommendations for the public health community to consider in dealing with the analytic challenges of exposome-based biomedical research. We describe extant and novel analytic methods needed to associate the exposome with critical health outcomes, and we contextualize the data-centered challenges by drawing parallels to other research endeavors such as human genomics research. We discuss efforts to train scientists who can bridge public health, genomics, and biomedicine in informatics and statistics. If an exposome data ecosystem is brought to fruition, it will likely play a role as central as the one genomic science has played in molding the current and new generations of biomedical researchers, computational scientists, and public health research programs.





  • Article Type: Review Article