
Abstract

The human microbiome is the totality of all microbes in and on the human body, and its importance in health and disease has been increasingly recognized. High-throughput sequencing technologies have recently enabled scientists to obtain an unbiased quantification of all microbes constituting the microbiome. Often, a single sample can produce hundreds of millions of short sequencing reads. However, unique characteristics of the data produced by the new technologies, as well as the sheer magnitude of these data, make drawing valid biological inferences from microbiome studies difficult. Analysis of these big data poses great statistical and computational challenges. Important issues include normalization and quantification of relative taxa, bacterial genes, and metabolic abundances; incorporation of phylogenetic information into analysis of metagenomics data; and multivariate analysis of high-dimensional compositional data. We review existing methods, point out their limitations, and outline future research directions.

/content/journals/10.1146/annurev-statistics-010814-020351
2015-04-10
2025-06-13