Conference paper
Characterizing metagenomic novelty with unexplained protein domain hits
Document type
Text/Conference Paper
Date
2014
Publisher
Gesellschaft für Informatik e.V.
Abstract
In metagenomics, the discovery of functional novelty has always been pursued in a gene-centered manner. As a result, sequence-based analysis has been restricted to particular features and to sequences of sufficient length. We propose a statistical approach that does not depend on identifying individual sequences and instead yields an overall characterization of a metagenome. Our method analyzes significant differences between the functional profile of a metagenome and its reconstruction from a combination of genomic profiles using the Taxy-Pro mixture model. Protein families with a large proportion of domain hits that the model cannot explain are interesting candidates for exploring metagenomic novelty. The results of three case studies indicate that our method can characterize metagenomic novelty in terms of the protein families that contribute significantly to unexplained domain counts. We found good correspondence between our predictions and the discoveries reported in the original studies, as well as specific indicators of functional novelty that have not yet been described.
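To make the idea of "unexplained domain hits" concrete, the following is a minimal sketch in Python. It is not the Taxy-Pro implementation. It assumes that each reference genome is summarized as a probability distribution over protein families, that mixture weights can be estimated with a plain expectation-maximization update, and that the unexplained share of a family is the positive excess of observed over reconstructed counts. The family names, profiles, and counts are invented for illustration, and the significance testing mentioned in the abstract is left out.

```python
# Illustrative sketch only: a generic mixture reconstruction, not Taxy-Pro itself.

def fit_mixture_weights(counts, profiles, iterations=200):
    """Estimate mixture weights for fixed genome profiles with EM (assumed estimator)."""
    k = len(profiles)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        updated = [0.0] * k
        for i, count in enumerate(counts):
            mix = sum(w * p[i] for w, p in zip(weights, profiles))
            if count == 0 or mix == 0.0:
                # Families that no reference genome covers cannot inform the weights.
                continue
            for j in range(k):
                updated[j] += count * weights[j] * profiles[j][i] / mix
        total = sum(updated)
        if total == 0.0:
            break
        weights = [value / total for value in updated]
    return weights


def unexplained_fraction(counts, profiles, weights):
    """Per family: share of observed hits that exceeds the reconstructed count."""
    total_hits = sum(counts)
    fractions = []
    for i, count in enumerate(counts):
        expected = total_hits * sum(w * p[i] for w, p in zip(weights, profiles))
        excess = max(0.0, count - expected)
        fractions.append(excess / count if count > 0 else 0.0)
    return fractions


# Hypothetical protein families and reference genome profiles (rows sum to 1).
families = ["FAM_A", "FAM_B", "FAM_C", "FAM_D"]
profiles = [
    [0.5, 0.5, 0.0, 0.0],  # hypothetical genome 1
    [0.0, 0.5, 0.5, 0.0],  # hypothetical genome 2
]
# Hypothetical metagenome domain hit counts; FAM_D occurs in no reference genome.
counts = [50, 100, 50, 40]

weights = fit_mixture_weights(counts, profiles)
fractions = unexplained_fraction(counts, profiles, weights)

# Rank families by the share of hits the reconstruction cannot account for.
ranking = sorted(zip(families, fractions), key=lambda item: item[1], reverse=True)
for family, fraction in ranking:
    print(f"{family}: {fraction:.2f} of hits unexplained")
```

In this toy input, the family that is absent from every reference profile ranks first, which mirrors the abstract's notion of candidate families for novelty. A real analysis would also need a test of whether each excess is statistically significant.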