Browsing by author "Reinert, Knut"
1 - 2 of 2
- Conference paper: A general paradigm for fast, adaptive clustering of biological sequences
  (German conference on bioinformatics – GCB 2007, 2007)
  Reinert, Knut; Bauer, Markus; Döring, Andreas; Klau, Gunnar W.; Halpern, Aaron L.
  There are numerous methods that compute clusterings of biological sequences based on pairwise distances. This necessitates the computation of O(n²) sequence comparisons. Users usually want to apply the most sensitive distance measure, which is normally also the most expensive in terms of runtime. This poses a problem if the number of sequences is large or the computation of the measure is slow. In this paper we present a general heuristic to speed up distance-based clustering methods considerably while compromising little on the accuracy of the results. The speedup comes from using fast comparison methods to perform an initial "top-down" split into relatively homogeneous clusters, while the slower measures are used for smaller groups. Then profiles are computed for the final groups and the resulting profiles are used in a bottom-up phase to compute the final clustering. The algorithm is general in the sense that any sequence comparison method can be employed (e.g. for DNA, RNA or amino acids). We test our algorithm using a prototypical implementation for agglomerative RNA clustering and show its effectiveness.
- Journal article: The Collaborative Research Center FONDA
  (Datenbank-Spektrum: Vol. 21, No. 3, 2021)
  Leser, Ulf; Hilbrich, Marcus; Draxl, Claudia; Eisert, Peter; Grunske, Lars; Hostert, Patrick; Kainmüller, Dagmar; Kao, Odej; Kehr, Birte; Kehrer, Timo; Koch, Christoph; Markl, Volker; Meyerhenke, Henning; Rabl, Tilmann; Reinefeld, Alexander; Reinert, Knut; Ritter, Kerstin; Scheuermann, Björn; Schintke, Florian; Schweikardt, Nicole; Weidlich, Matthias
  Today's scientific data analysis very often requires complex Data Analysis Workflows (DAWs) executed over distributed computational infrastructures, e.g., clusters. Much research effort is devoted to the tuning and performance optimization of specific workflows for specific clusters. However, an arguably even more important problem for accelerating research is the reduction of development, adaptation, and maintenance times of DAWs. We describe the design and setup of the Collaborative Research Center (CRC) 1404 "FONDA – Foundations of Workflows for Large-Scale Scientific Data Analysis", in which roughly 50 researchers jointly investigate new technologies, algorithms, and models to increase the portability, adaptability, and dependability of DAWs executed over distributed infrastructures. We describe the motivation behind our project, explain its underlying core concepts, introduce FONDA's internal structure, and sketch our vision for the future of workflow-based scientific data analysis. We also describe some lessons learned during the "making of" a CRC in Computer Science with strong interdisciplinary components, with the aim of fostering similar endeavors.
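
The two-phase heuristic summarized in the first abstract above (Reinert et al., GCB 2007) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the k-mer Jaccard distance stands in for the "fast" measure, plain edit distance stands in for the "slow" measure, a medoid sequence stands in for a group profile, and all function names and parameters (`top_down_split`, `bottom_up`, `max_size`, `n_clusters`) are hypothetical.

```python
from itertools import combinations


def kmer_distance(a, b, k=2):
    """Cheap measure (assumption): Jaccard distance between k-mer sets."""
    ka = {a[i:i + k] for i in range(len(a) - k + 1)}
    kb = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = ka | kb
    return 1.0 - len(ka & kb) / len(union) if union else 0.0


def edit_distance(a, b):
    """Expensive measure (assumption): Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def top_down_split(seqs, max_size, fast_dist):
    """Recursively bisect around two far-apart pivots using only the fast measure."""
    if len(seqs) <= max_size:
        return [seqs]
    p1 = max(seqs, key=lambda s: fast_dist(seqs[0], s))
    p2 = max(seqs, key=lambda s: fast_dist(p1, s))
    left = [s for s in seqs if fast_dist(s, p1) <= fast_dist(s, p2)]
    right = [s for s in seqs if fast_dist(s, p1) > fast_dist(s, p2)]
    if not left or not right:  # the fast measure cannot separate this group
        return [seqs]
    return (top_down_split(left, max_size, fast_dist)
            + top_down_split(right, max_size, fast_dist))


def medoid(group, slow_dist):
    """Stand-in for a profile: the member with the smallest total slow distance."""
    return min(group, key=lambda s: sum(slow_dist(s, t) for t in group))


def bottom_up(groups, slow_dist, n_clusters):
    """Agglomerative merging of groups, comparing only their representatives."""
    clusters = [(medoid(g, slow_dist), list(g)) for g in groups]
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: slow_dist(clusters[ij[0]][0], clusters[ij[1]][0]))
        (rep_i, mem_i), (rep_j, mem_j) = clusters[i], clusters[j]
        # Keep the representative of the larger group to avoid recomputing a medoid.
        rep = rep_i if len(mem_i) >= len(mem_j) else rep_j
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((rep, mem_i + mem_j))
    return [members for _, members in clusters]


def cluster(seqs, max_size, n_clusters,
            fast_dist=kmer_distance, slow_dist=edit_distance):
    """Fast top-down split, then slow bottom-up merge on group representatives."""
    groups = top_down_split(list(seqs), max_size, fast_dist)
    return bottom_up(groups, slow_dist, n_clusters)
```

In this sketch the expensive measure is only ever evaluated inside small groups and between group representatives, which is where the reduction from O(n²) slow comparisons would come from; how the actual paper builds and compares profiles is not stated in the abstract and is not modeled here.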