An application of latent topic document analysis to large-scale proteomics databases

dc.contributor.author: Klie, Sebastian
dc.contributor.author: Martens, Lennart
dc.contributor.author: Vizcaino, Juan Antonio
dc.contributor.author: Cote, Richard
dc.contributor.author: Jones, Phil
dc.contributor.author: Apweiler, Rolf
dc.contributor.author: Hinneburg, Alexander
dc.contributor.author: Hermjakob, Henning
dc.contributor.editor: Falter, Claudia
dc.contributor.editor: Schliep, Alexander
dc.contributor.editor: Selbig, Joachim
dc.contributor.editor: Vingron, Martin
dc.contributor.editor: Walther, Dirk
dc.date.accessioned: 2019-05-15T08:32:30Z
dc.date.available: 2019-05-15T08:32:30Z
dc.date.issued: 2007
dc.description.abstract: Since the advent of public data repositories for proteomics data, readily accessible results from high-throughput experiments have been accumulating steadily. Several large-scale projects in particular have contributed substantially to the number of identifications available to the community. Despite the considerable body of information amassed, very few successful analyses have been performed and published on this data, levelling off the ultimate value of these projects far below their potential. In order to illustrate that these repositories should be considered sources of detailed knowledge instead of data graveyards, we here present a novel way of analyzing the information contained in proteomics experiments with a 'latent semantic analysis'. We apply this information retrieval approach to the peptide identification data contributed by the Plasma Proteome Project. Interestingly, this analysis is able to overcome the fundamental difficulties of analyzing such divergent and heterogeneous data emerging from large-scale proteomics studies employing a vast spectrum of different sample treatment and mass-spectrometry technologies. Moreover, it yields several concrete recommendations for optimizing proteomics project planning as well as the choice of technologies used in the experiments. It is clear from these results that the analysis of large bodies of publicly available proteomics data holds great promise and is currently underexploited.
dc.identifier.isbn: 978-3-88579-209-3
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/22366
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e. V.
dc.relation.ispartof: German conference on bioinformatics – GCB 2007
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-115
dc.title: An application of latent topic document analysis to large-scale proteomics databases
dc.type: Text/Conference Paper
gi.citation.endPage: 184
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 174
gi.conference.date: September 26-28, 2007
gi.conference.location: Potsdam
gi.conference.sessiontitle: Regular Research Papers

Files

Original bundle
Name: 174.pdf
Size: 323.16 KB
Format: Adobe Portable Document Format