Browsing by author "Rarey, Matthias"
1 - 10 of 23
- Conference paper: Composite module analyst: A fitness-based tool for prediction of transcription regulation (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Kel, Alexander; Konovalova, Tatiana; Waleev, Tagir; Cheremushkin, Evgeny; Kel-Margoulis, Olga; Wingender, Edgar. Functionally related genes involved in the same molecular-genetic, biochemical, or physiological process are often regulated coordinately. Such regulation is provided by the precisely organized binding of a multiplicity of special proteins (transcription factors) to their target sites (cis-elements) in the regulatory regions of genes. Cis-element combinations provide a structural basis for the generation of unique patterns of gene expression. Here we present a new approach for defining promoter models based on the composition of transcription factor binding sites and their pairs. We use a multicomponent fitness function to select the promoter model that best fits the observed gene expression profile. We demonstrate examples of successful application of the fitness function, with the help of a genetic algorithm, to the analysis of functionally related or co-expressed genes, as well as tests on simulated data.
- Conference paper: Data processing effects on the interpretation of microarray gene expression experiments (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Fundel, Katrin; Küffner, Robert; Aigner, Thomas; Zimmer, Ralf. Motivation: Microarray gene expression data is collected at an increasing pace, and numerous methods and tools exist for analyzing this kind of data. The aim of this study is to evaluate the effect of the basic statistical processing steps of microarray data on the final outcome of gene expression analysis; these effects are most problematic for one-channel cDNA measurements, but also affect other types of microarrays, especially when dealing with grouped samples. It is crucial to determine an appropriate combination of individual processing steps for a given dataset in order to improve the validity and reliability of expression data analysis. Results: We analyzed a large gene expression data set obtained from a one-channel cDNA microarray experiment conducted on 83 human samples that have been classified into four osteoarthritis-related groups. We compared different normalization methods regarding their effect on the identification of differentially expressed genes. Furthermore, we compared different methods for combining spot p-values into gene p-values, and propose Stouffer's method for this purpose. We developed several quality and robustness measures that allow the number of errors made in the statistical data preparation to be estimated. Conclusion: The apparently straightforward steps of gene expression data analysis, i.e. normalization and identification of differentially expressed genes, can be accomplished by numerous different methods. We analyzed multiple combinations of a number of methods to demonstrate the possible effects, and therefore the importance, of the individual decisions taken during data processing. An overview of these effects is essential for the biological interpretation of gene expression measurements. We give guidelines and tools for evaluating methods for normalization, spot combination and detection of differentially regulated genes.
- Conference paper: Efficient mapping of large cDNA/EST databases to genomes: A comparison of two different strategies (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Wawra, Christian; Abouelhoda, Mohamed I.; Ohlebusch, Enno. This paper presents a comparison of two strategies for cDNA/EST mapping: the seed-and-extend strategy and the fragment-chaining strategy. We derive theoretical results on the statistics of fragments of type maximal exact match. Moreover, we present efficient fragment-chaining algorithms that are simpler than previous ones. In experiments, we compared our implementation of the fragment-chaining strategy with the seed-and-extend strategy implemented in the software tool BLAT.
- Conference paper: Exploiting scale-free information from expression data for cancer classification (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Antonov, Alexey V.; Tetko, Igor V.; Kosykh, Denis; Surmeli, Dmitrij; Mewes, Hans-Werner. Most studies of expression data exploit information on the variability of gene intensity across samples. This information is sensitive to the initial data processing, which affects the final conclusions. However, expression data contains scale-free information that is directly comparable between different samples. We propose to use the pairwise ratios of gene expression values rather than their absolute intensities for the classification of expression data. This information is robust to data processing and thus more attractive for classification analyses. In the proposed analysis scheme, only information on relative gene expression levels within each sample is exploited. Testing on publicly available datasets leads to superior classification results.
- Conference paper: Family specific rates of protein evolution (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Luz, Hannes; Vingron, Martin. Amino-acid-changing mutations in proteins are constrained by purifying selection and accumulate at different rates. We estimate evolutionary rates on multiple alignments of eukaryotic protein families in a maximum likelihood framework. We find that the evolution of indispensable proteins is constrained by selection and that protein secretion is coupled to an increased evolutionary rate.
- Conference paper: From greedy to branch & bound and back: assessing optimization strategies for incremental construction molecular docking tools (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Griewel, Axel; Rarey, Matthias. A branch & bound approach for the assembly phase of the incremental construction algorithm of the software package FlexX is presented. For this purpose, a local bound for partial solutions has been implemented that estimates the best score achievable for the considered solutions. This estimate is based on the scoring values that single components can achieve in certain regions of the active site, as well as on distance constraints deduced from the composition of the considered partial solution. Furthermore, a time- and space-bounded search strategy specific to the addressed problem has been developed. The implemented algorithm was tested on a dataset containing 169 complexes. In 116 of these cases the calculation finished within a reasonably defined time frame, while the calculation for the remaining complexes was not completed in this period. For all calculated complexes, the best solution has a score better than or equal to that of the solution found by standard FlexX. This, however, is not always associated with an improvement of the RMSD between the calculated placement of the ligand and the crystal structure. The presented algorithm is applicable to thorough virtual screening of small sets of ligands comprising up to nine rotatable, acyclic bonds. Furthermore, the method gives important insights into the k-greedy method and can be used for the scientific assessment of new scoring functions within FlexX.
- Conference paper: Generation of 3D templates of active sites of proteins with rigid prosthetic groups (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Nebel, Jean-Christophe. With the increasing availability of protein structures, the generation of biologically meaningful 3D patterns from their simultaneous alignment is an exciting prospect: active sites could be better understood, and protein functions and structures could be predicted more accurately. Although patterns can be generated at the fold and topological levels, no system produces high-resolution 3D patterns including atom and cavity positions. Here, we present a new approach allowing the generation of 3D patterns from the alignment of proteins with rigid prosthetic groups. Using 237 proteins representative of this class, our method was validated by comparing 3D templates generated from homologues with the structures of the proteins they model. Atom positions were predicted reliably: 93% of them had an error below 1.00 Å. Similar results were obtained for chemical group and cavity positions. Finally, a 3D template was generated for the active site of human cytochrome P450 CYP17. Its analysis showed that it is biologically meaningful: our method detected the main patterns and motifs of the P450 superfamily. The 3D template also suggested the locations of a cavity and of a hydrogen bond between CYP17 and its substrates. Comparisons with independently generated 3D models supported these hypotheses.
- Conference paper: Geometric problems and algorithms in computer-aided molecular design (Informatik bewegt: Informatik 2002 - 32. Jahrestagung der Gesellschaft für Informatik e.V. (GI), Ergänzungsband, 2002). Rarey, Matthias. Computer-aided molecular design describes a research area covering all kinds of computer applications in the design of molecules with desired properties. It is widely applied to the design of bioactive molecules such as pharmaceuticals or agricultural products. From the computer scientist's perspective, computer-aided molecular design contains a large variety of computational problems, often geometric and/or combinatorial in nature. In the introductory part of this talk, a short overview of these problems is given. The main focus of this talk will be on protein-ligand docking. The aim of a docking calculation is to predict whether two molecules bind to each other (i.e. form an energetically favorable complex). If they do, one is interested in the geometry of the molecular complex and in the binding energy. The most prominent variant is protein-ligand docking, in which the first molecule is a protein and the second is a small compound. Since most drug targets are proteins and most drugs are small compounds, software for protein-ligand docking is used extensively in pharmaceutical research. Since 1993, we have been developing the software package FlexX, which is among the most widely used codes for protein-ligand docking. An outline of the underlying models, the applied algorithms, and some examples will be presented.
- Edited book
- Conference paper: Inferring regulatory systems with noisy pathway information (German Conference on Bioinformatics 2005 (GCB 2005), 2005). Spieth, Christian; Streichert, Felix; Speern, Nora; Zell, Andreas. With the increasing number of pathways available in public databases, inferring gene regulatory networks becomes more and more feasible. The major problem with most of these pathways is that they are very often faulty or describe only parts of a regulatory system, either because of limitations of the experimental techniques or because they focus on a subnetwork of a larger process. To address this issue, we propose a new multi-objective evolutionary algorithm that infers gene regulatory systems from experimental microarray data by incorporating known pathways from publicly available databases. These pathways are used as an initial template for creating suitable models of the regulatory network and are then refined by the algorithm. With this approach, we were able to infer regulatory systems while incorporating pathway information that is incomplete or even faulty.