- Conference paper: Tanimoto’s Best Barbecue: Discovering Regulatory Modules using Tanimoto Scores (German conference on bioinformatics – GCB 2007, 2007). Menzel, Peter; Stadler, Peter F.; Mosig, Axel; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. We present a combinatorial method for discovering cis-regulatory modules in promoter sequences. Our approach combines “sliding window” approaches with a scoring function based on the so-called Tanimoto score. This allows the identification of sets of binding sites that tend to occur preferentially in the vicinity of each other in a given set of promoter sequences belonging to co-expressed or orthologous genes. We benchmark our method on a data set derived from muscle-specific genes, demonstrating that our approach is capable of identifying modules that were identified as functional in previous studies.
- Conference paper: Protein structure comparison based on fold evolution (German conference on bioinformatics – GCB 2007, 2007). Kurbatova, Natalja; Mančinska, Laura; Vīksna, Juris; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. The paper presents a protein structure comparison algorithm that is capable of identifying specific fold mutations between two proteins. The search for such mutations is based on structure evolution models suggesting that, like sequences, protein folds (at least partially) evolve by a stepwise process, where each step comprises comparatively simple changes affecting few secondary structure elements. The particular fold mutations considered in this study are based on the work by Grishin [Gr01]. The algorithm represents structures as 3D graphs and is a modification of a method used in the SSM structure alignment tool [KH04a]. Experiments demonstrate that our method is able to automatically identify 85% of the examples of fold mutations given by Grishin. In addition, a number of tests involving all-against-all comparisons of CATH structural domains have been performed in order to measure the comparative frequencies of different types of fold mutations, and some statistical estimates have been obtained.
- Conference paper: Understanding of SMFS barriers by means of energy profiles (German conference on bioinformatics – GCB 2007, 2007). Dressel, Frank; Marsico, Annalisa; Tuukkanen, Anne; Schroeder, Michael; Labudde, Dirk; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. In recent years, Single Molecule Force Spectroscopy (SMFS) has been increasingly used to gain insight into the fundamental principles behind protein structure and stability. Nevertheless, the interpretation of the experimental findings is not straightforward, and additional computational approaches are needed to interpret them. Here, we propose an approach based on interaction patterns between amino acids to explain the emergence of SMFS unfolding barriers in the experiment. With our approach, we can predict around 64% of the experimentally detectable barriers.
- Conference paper: Identifying microRNAs and their targets (German conference on bioinformatics – GCB 2007, 2007). Rajewsky, Nikolaus; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. I will summarize what can be learned from predicting and analyzing microRNA targets. As an example, I will discuss the function of miR-150 in the immune system. Finally, I will present a new algorithm for the identification of microRNAs from deep sequencing data.
- Conference paper: RNAplex: a fast and flexible RNA-RNA interaction search tool (German conference on bioinformatics – GCB 2007, 2007). Tafer, Hakim; Hofacker, Ivo L.; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. Regulatory RNAs often unfold their action via RNA-RNA interaction. Transcriptional gene silencing by means of siRNAs and miRNAs as well as snoRNA-directed RNA editing rely on this mechanism. ncRNA regulation in bacteria is mainly based upon RNA duplex formation. Finding putative target sites for newly discovered ncRNAs is a lengthy task, as tools for cofolding RNA molecules like RNAcofold and RNAup have a run time of O((n + m)^3), which makes them impractical for whole-genome searches. We present a new program, RNAplex, especially designed to quickly find possible hybridization sites for a query RNA in large RNA databases. In contrast to earlier approaches, RNAplex uses a slightly different energy model, which reduces the computational time by a factor of 65 compared to RNAhybrid without loss of sensitivity.
- Conference paper: Finding relevant biotransformation routes in weighted metabolic networks using atom mapping rules (German conference on bioinformatics – GCB 2007, 2007). Blum, Torsten; Kohlbacher, Oliver; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. Computational analysis of pathways in metabolic networks has numerous applications in systems biology. While graph theory-based approaches have been presented that find biotransformation routes from one metabolite to another in these networks, most of these approaches suffer from finding too many routes, most of which are biologically infeasible or meaningless. We present a novel approach for finding relevant routes based on atom mapping rules (describing which educt atoms are mapped onto which product atoms in a chemical reaction). This leads to a reformulation of the problem as a lightest path search in a degree-weighted metabolic network. A key component of the approach is a new method of computing optimal atom mapping rules.
- Conference paper: Are we overestimating the number of cell-cycling genes? The impact of background models for time series data (German conference on bioinformatics – GCB 2007, 2007). Futschik, Matthias E.; Herzel, Hanspeter; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. Periodic processes play fundamental roles in organisms. Prominent examples are the cell cycle and the circadian clock. Microarray technology has enabled us to screen complete sets of transcripts for possible association with such fundamental periodic processes on a system-wide level. Frequently, quite a large number of genes have been detected as periodically expressed. However, the small overlap of identified genes between different studies has cast considerable doubt on the reliability of the detected periodic expression. In this study, we show that a major reason for the lack of agreement is the use of an inadequate background model for the determination of significance. We demonstrate that the choice of background model has considerable impact on the statistical significance of periodic expression. For illustration, we reanalyzed two microarray studies of the yeast cell cycle. Our evaluation strongly indicates that the results of previous analyses might have been overoptimistic and that the use of a more suitable background model promises to give more realistic results.
- Edited book: German conference on bioinformatics – GCB 2007 (2007). Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk
- Conference paper: A general paradigm for fast, adaptive clustering of biological sequences (German conference on bioinformatics – GCB 2007, 2007). Reinert, Knut; Bauer, Markus; Döring, Andreas; Klau, Gunnar W.; Halpern, Aaron L.; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. There are numerous methods that compute clusterings of biological sequences based on pairwise distances. This necessitates the computation of O(n^2) sequence comparisons. Users usually want to apply the most sensitive distance measure, which normally is the most expensive in terms of runtime. This poses a problem if the number of sequences is large or the computation of the measure is slow. In this paper we present a general heuristic to speed up distance-based clustering methods considerably while compromising little on the accuracy of the results. The speedup comes from using fast comparison methods to perform an initial ‘top-down’ split into relatively homogeneous clusters, while the slower measures are used for smaller groups. Then profiles are computed for the final groups, and the resulting profiles are used in a bottom-up phase to compute the final clustering. The algorithm is general in the sense that any sequence comparison method can be employed (e.g. for DNA, RNA or amino acids). We test our algorithm using a prototypical implementation for agglomerative RNA clustering and show its effectiveness.
- Conference paper: Integrative analysis of transcriptome and metabolome data (German conference on bioinformatics – GCB 2007, 2007). Willmitzer, Lothar; Caldana, Camila; Fernie, Alisdair; Giavalisco, Patrick; Hannah, Matthew; Redestig, Hennig; Steinhauser, Dirk; Falter, Claudia; Schliep, Alexander; Selbig, Joachim; Vingron, Martin; Walther, Dirk. Biological systems have to react to environmental and/or developmental changes by adjusting their biochemical/cellular machinery on numerous levels. In many cases small molecules play a crucial role as signal molecules, transmitting changes in state to receptor molecules, which subsequently translate this into changes on various levels of the realization of genomic information. We start out from the hypothesis that in most biological systems many more small molecules serve a signalling function than is generally acknowledged. This hypothesis is based on three simple arguments, which present limitations in classical approaches to detecting signalling molecules: (i) in order to identify a signalling molecule, one has to be able to monitor the response of the system under study towards this signal; (ii) in order to identify a signalling molecule, one has to be able to detect and quantify this molecule; (iii) in many cases, tests for signalling molecules were performed by externally applying the putative signal molecule to the system, and it is obvious that further problems linked to uptake and mobility of the compound arise with this approach. We therefore set out to design an experiment in which we tried to largely overcome these three limitations. As to the response of the system, we decided to use the transcriptional response as a read-out. As to the second and third limitations, we decided to monitor endogenous changes of small molecules instead.
To this end, we applied metabolomics techniques to analyse the state of the system; at least in an ideal world, these should allow the identification and quantification of all small molecules present in the biological system. We present here a first set of data from a large experiment in which the response of a plant system (leaves of Arabidopsis thaliana) to numerous environmental conditions was followed in parallel on both analytic levels described above, i.e. RNA expression and metabolite profiles.
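
The first entry in this listing combines sliding windows with a scoring function based on the Tanimoto score, but the abstract does not give the scoring function itself. As a rough illustration only, the sketch below uses the standard set-based Tanimoto coefficient, |A ∩ B| / |A ∪ B|, to compare the sets of binding sites found in windows of two promoters. The helper `window_site_sets`, the window parameters, and the toy site positions are all assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, List, Set, Tuple


def tanimoto(a: Set[str], b: Set[str]) -> float:
    """Standard set-based Tanimoto coefficient: |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def window_site_sets(
    sites: Iterable[Tuple[int, str]], seq_len: int, window: int, step: int
) -> List[Set[str]]:
    """Hypothetical helper: set of site names starting inside each window."""
    site_list = list(sites)
    result = []
    for start in range(0, max(seq_len - window, 0) + 1, step):
        end = start + window
        result.append({name for pos, name in site_list if start <= pos < end})
    return result


# Hypothetical toy data: (position, motif name) pairs for two promoters.
promoter_a = [(5, "MEF2"), (12, "SRF"), (40, "SP1")]
promoter_b = [(8, "SRF"), (15, "MEF2"), (60, "TEF")]

windows_a = window_site_sets(promoter_a, seq_len=80, window=20, step=20)
windows_b = window_site_sets(promoter_b, seq_len=80, window=20, step=20)

# Best-scoring window pair: a high score means the two windows share
# most of their binding sites, hinting at a common module.
best = max(tanimoto(wa, wb) for wa in windows_a for wb in windows_b)
```

With these toy inputs, the first window of each promoter contains the same two sites, so that pair receives the highest score. The method in the paper may weight, normalise, or aggregate scores across many sequences differently.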