
Computational methods for small molecule identification

dc.contributor.author: Dührkop, Kai
dc.date.accessioned: 2021-06-21T12:20:11Z
dc.date.available: 2021-06-21T12:20:11Z
dc.date.issued: 2019
dc.description.abstract: Identification of small molecules remains a central question in analytical chemistry, in particular for natural product research, metabolomics, environmental research, and biomarker discovery. Mass spectrometry is the predominant technique for high-throughput analysis of small molecules. However, it reveals only the mass of molecules and, with tandem mass spectrometry, the masses of molecular fragments. Automated interpretation of mass spectra is often limited to searching in spectral libraries, so that we can only dereplicate molecules for which reference mass spectra have already been recorded. In my thesis “Computational methods for small molecule identification” we developed SIRIUS, a tool for the structural elucidation of small molecules with tandem mass spectrometry. The method first computes a hypothetical fragmentation tree using combinatorial optimization. By using a Bayesian statistical model, we can learn parameters and hyperparameters of the underlying scoring directly from data. We demonstrate that the statistical model, which was fitted on a small dataset, generalizes well across many different datasets and mass spectrometry instruments. In a second step, the fragmentation tree is used to predict a molecular fingerprint using kernel support vector machines. The predicted fingerprint can be searched in a structure database to identify the molecular structure. We demonstrate that our machine learning model outperforms all other methods for this task, including its predecessor FingerID. SIRIUS is available as a command-line tool and with a user interface. The molecular fingerprint prediction is implemented as a web service and receives over one million requests per month. [en]
dc.identifier.doi: 10.1515/itit-2019-0033
dc.identifier.pissn: 2196-7032
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/36663
dc.language.iso: en
dc.publisher: De Gruyter
dc.relation.ispartof: it - Information Technology: Vol. 61, No. 5-6
dc.subject: bioinformatics
dc.subject: mass spectrometry
dc.subject: machine learning
dc.subject: metabolomics
dc.title: Computational methods for small molecule identification [en]
dc.type: Text/Journal Article
gi.citation.endPage: 292
gi.citation.publisherPlace: Berlin
gi.citation.startPage: 285
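
The abstract's final step, searching a predicted molecular fingerprint in a structure database, can be illustrated with a minimal sketch. This is not the scoring used by SIRIUS, which the abstract does not specify. It assumes fingerprints are sets of "on" bit indices and ranks candidates by Tanimoto similarity, a common generic measure chosen here only for illustration. The names `tanimoto`, `rank_candidates`, and the candidate data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def rank_candidates(
    predicted: Set[int], candidates: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Score every database candidate against the predicted fingerprint,
    best match first."""
    scored = [(name, tanimoto(predicted, fp)) for name, fp in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical predicted fingerprint (e.g. classifier outputs thresholded to bits).
predicted = {0, 2, 3, 7}

# Hypothetical structure database: candidate name -> fingerprint bits.
candidates = {
    "candidate_A": {0, 2, 3, 7, 9},
    "candidate_B": {1, 4, 5},
    "candidate_C": {0, 2, 8},
}

ranking = rank_candidates(predicted, candidates)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the candidate sharing the most bits with the prediction ranks first. A real system would work with predicted bit probabilities rather than hard bits and would use a scoring fitted to them.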

Files