Dirichlet multinomial mixtures: generative models for microbial metagenomics
Publisher
United States: Public Library of Science
Language
English
More information
— SCOPE AND CONTENTS
Contents
We introduce Dirichlet multinomial mixtures (DMM) for the probabilistic modelling of microbial metagenomics data. These data can be represented as a frequency matrix giving the number of times each taxon is observed in each sample. The samples differ in size, and the matrix is sparse, because communities are diverse and skewed towards rare taxa. Most methods used previously to classify or cluster samples have ignored these features. We describe each community by a vector of taxa probabilities. These vectors are generated from one of a finite number of Dirichlet mixture components, each with different hyperparameters. Observed samples are generated through multinomial sampling. The mixture components cluster communities into distinct 'metacommunities' and hence determine envirotypes or enterotypes: groups of communities with a similar composition. The model can also deduce the impact of a treatment and be used for classification. We wrote software for fitting DMM models using the 'evidence framework' (http://code.google.com/p/microbedmm/). This includes the Laplace approximation of the model evidence. We applied the DMM model to human gut microbe genera frequencies from Obese and Lean twins. According to the model evidence, four clusters fit these data best. Two clusters were dominated by Bacteroides and were homogeneous; two had a more variable community composition. We could not find a...
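The generative process described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' microbedmm software and does no model fitting; it only draws synthetic count data in the order the abstract gives: pick a mixture component, draw a vector of taxa probabilities from that component's Dirichlet distribution, then draw the observed counts by multinomial sampling. The function name `sample_dmm`, its parameters, and the hyperparameter values are all hypothetical and chosen purely for illustration.

```python
import numpy as np


def sample_dmm(n_samples, mix_weights, alphas, read_depths, seed=0):
    """Simulate counts from a Dirichlet multinomial mixture (illustrative only).

    mix_weights : (K,) mixing proportions of the K components, summing to 1
    alphas      : (K, S) Dirichlet hyperparameters, one row per component
    read_depths : (n_samples,) total number of reads in each sample
    """
    rng = np.random.default_rng(seed)
    mix_weights = np.asarray(mix_weights, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    n_components, n_taxa = alphas.shape

    # Step 1: assign each community to one mixture component ("metacommunity").
    labels = rng.choice(n_components, size=n_samples, p=mix_weights)

    counts = np.zeros((n_samples, n_taxa), dtype=int)
    for i, k in enumerate(labels):
        # Step 2: draw the community's taxa probabilities from component k.
        taxa_probs = rng.dirichlet(alphas[k])
        # Step 3: draw the observed sample by multinomial sampling.
        counts[i] = rng.multinomial(read_depths[i], taxa_probs)
    return labels, counts


# Hypothetical example: two components over four taxa. The first component
# is dominated by taxon 0; the second is more even and more variable.
alphas = [[20.0, 1.0, 1.0, 1.0],
          [2.0, 2.0, 2.0, 2.0]]
read_depths = [100, 250, 80, 400, 150, 300]
labels, counts = sample_dmm(6, [0.5, 0.5], alphas, read_depths)
print(labels)
print(counts)
```

Each row of `counts` sums to that sample's read depth, so unequal sample sizes are built into the simulation rather than normalised away, which is one of the data features the abstract says earlier methods ignored.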
— ALTERNATIVE TITLES
Full title
Dirichlet multinomial mixtures: generative models for microbial metagenomics
— AUTHORS, ARTISTS AND CONTRIBUTORS
Author / Creator
Identifiers
— PRIMARY IDENTIFIERS
Record Identifier
TN_cdi_plos_journals_1323434142
Permalink
https://collection.sl.nsw.gov.au/record/TN_cdi_plos_journals_1323434142
— OTHER IDENTIFIERS
ISSN
1932-6203
E-ISSN
1932-6203
DOI
10.1371/journal.pone.0030126