Conference paper

An application of latent topic document analysis to large-scale proteomics databases


Document type

Text/Conference Paper

Date

2007

Publisher

Gesellschaft für Informatik e. V.

Abstract

Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the number of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses of this data have been performed and published, leaving the ultimate value of these projects far below their potential. To illustrate that these repositories should be considered sources of detailed knowledge instead of data graveyards, we here present a novel way of analyzing the information contained in proteomics experiments using 'latent semantic analysis'. We apply this information retrieval approach to the peptide identification data contributed by the Plasma Proteome Project. Interestingly, this analysis is able to overcome the fundamental difficulties of analyzing the divergent and heterogeneous data that emerge from large-scale proteomics studies employing a vast spectrum of different sample treatment and mass-spectrometry technologies. Moreover, it yields several concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in the experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data holds great promise and is currently underexploited.

Description

Klie, Sebastian; Martens, Lennart; Vizcaino, Juan Antonio; Cote, Richard; Jones, Phil; Apweiler, Rolf; Hinneburg, Alexander; Hermjakob, Henning (2007): An application of latent topic document analysis to large-scale proteomics databases. German conference on bioinformatics – GCB 2007. Bonn: Gesellschaft für Informatik e. V. PISSN: 1617-5468. ISBN: 978-3-88579-209-3. pp. 174-184. Regular Research Papers. Potsdam, September 26-28, 2007.
