P157 - GCB 2009 - German Conference on Bioinformatics 2009
Listing of P157 - GCB 2009 - German Conference on Bioinformatics 2009 by publication date
- Conference paper: Maximum likelihood estimation of weight matrices for targeted homology search (German conference on bioinformatics 2009, 2009) Menzel, Peter; Gorodkin, Jan; Stadler, Peter F. Genome annotation relies to a large extent on the recognition of homologs to already known genes. The starting point for such protocols is a collection of known sequences from one or more species, from which a model is constructed, either automatically or manually, that encodes the defining features of a single gene or a gene family. The quality of these models eventually determines the success rate of the homology search. We propose here a novel approach to model construction that not only captures the characteristic motifs of a gene, but also adjusts the search pattern by including phylogenetic information. Computational tests demonstrate that this can lead to a substantial improvement of homology search models.
- Edited book
- Conference paper: Discovering temporal patterns of differential gene expression in microarray time series (German conference on bioinformatics 2009, 2009) Stegle, Oliver; Denby, Katherine J.; McHattie, Stuart; Mead, Andrew; Wild, David L.; Ghahramani, Zoubin; Borgwardt, Karsten M. A wealth of time series of microarray measurements has become available over recent years. Several two-sample tests for detecting differential gene expression in these time series have been defined, but they can only answer whether a gene is differentially expressed across the whole time series, not in which intervals it is differentially expressed. In this article, we propose a Gaussian process based approach for studying these dynamics of differential gene expression. In experiments on Arabidopsis thaliana gene expression levels, our novel technique helps us to uncover that the family of WRKY transcription factors appears to be involved in the early response to infection by a fungal pathogen.
- Conference paper: Aligning protein structures using distance matrices and combinatorial optimization (German conference on bioinformatics 2009, 2009) Wohlers, Inken; Petzold, Lars; Domingues, Francisco S.; Klau, Gunnar W. Structural alignments of proteins are used to identify structural similarities. These similarities can indicate homology or a common or similar function. Several, mostly heuristic, methods are available to compute structural alignments. In this paper, we present a novel algorithm that uses methods from combinatorial optimization to compute provably optimal structural alignments of sparse protein distance matrices. Our algorithm extends an elegant integer linear programming approach proposed by Caprara et al. for the alignment of protein contact maps. We consider two different types of distance matrices with distances either between Cα atoms or between the two closest atoms of each residue. Via a comprehensive parameter optimization on HOMSTRAD alignments, we determine a scoring function for aligned pairs of distances. We introduce a negative score for non-structural, purely sequence-based parts of the alignment as a means to adjust the locality of the resulting structural alignments. Our approach is implemented in a freely available software tool named PAUL (Protein structural Alignment Using Lagrangian relaxation). On the challenging SISY data set of 130 reference alignments we compare PAUL to six state-of-the-art structural alignment algorithms: DALI, MATRAS, FATCAT, SHEBA, CA, and CE. Here, PAUL reaches the highest average and median alignment accuracies of all methods and is the most accurate method for more than 30% of the alignments. PAUL is thus a competitive tool for pairwise high-quality structural alignment.
- Conference paper: Expression profiles of metabolic models to predict compartmentation of enzymes in multi-compartmental systems (German conference on bioinformatics 2009, 2009) Chokkathukalam, Achuthanunni; Poolman, Marc; Ferrazzi, Chiara; Fell, David. Enzymes and other proteins coded by nuclear genes are targeted towards various compartments in the plant cell. Here, we describe a method by which localisation of enzymes in a plant cell may be predicted based on their transcription profile in conjunction with analysis of the structure of the metabolic network. This method uses reaction correlation coefficients to identify reactions in a metabolic model that carry similar flux. First, a correlation matrix for the expression of genes of interest is calculated and the columns are clustered hierarchically using the correlation coefficient. The rows are clustered using reaction correlation coefficients. In the resulting matrix, we show that the genes in a particular compartment are clustered together and that compartmental predictions with respect to a reference gene can be readily made.
- Conference paper: Traceable analysis of multiple-stage mass spectra through precursor-product annotations (German conference on bioinformatics 2009, 2009) Horai, Hisayuki; Arita, Masanori; Ojima, Yuya; Nihei, Yoshito; Kanaya, Shigehiko; Nishioka, Takaaki. We present a wiki-based interface for multiple-stage mass spectra with molecular structures and their physicochemical properties. Spectra for 453 metabolites were measured on QqTOF-MSn and their ion peaks were annotated with consideration of fragmentation patterns, especially bond cleavages. The resulting information was classified on wiki pages, where related molecular formulas and their relationships were likewise accumulated. Each page is rendered with search operation(s) using formulas as keys, and related information is automatically updated as database (http://massbank.jp/), is available at http://metabolomics.jp/wiki/Index:MassBank.
- Conference paper: 2D projections of RNA folding landscapes (German conference on bioinformatics 2009, 2009) Lorenz, Rony; Flamm, Christoph; Hofacker, Ivo L. The analysis of RNA folding landscapes yields insights into the kinetic folding behavior not available from classical structure prediction methods. This is especially important for multi-stable RNAs whose function is related to structural changes, as in the case of riboswitches. However, exact methods such as barrier tree analysis scale exponentially with sequence length. Here we present an algorithm that computes a projection of the energy landscape into two dimensions, namely the distances to two reference structures. This yields an abstraction of the high-dimensional energy landscape that can be conveniently visualized, and can serve as the basis for estimating energy barriers and refolding pathways. With an asymptotic time complexity of O(n^7) the algorithm is computationally demanding. However, by exploiting the sparsity of the dynamic programming matrices and parallelization for multi-core processors, our implementation is practical for sequences of up to 400 nt, which includes most RNAs of biological interest.
- Conference paper: Converting DNA to music: COMPOSALIGN (German conference on bioinformatics 2009, 2009) Ingalls, Todd; Martius, Georg; Hellmuth, Marc; Marz, Manja; Prohaska, Sonja J. Alignments are among the most important data types in the field of comparative genomics. They can be abstracted to a character matrix derived from aligned sequences. A variety of biological questions forces the researcher to inspect these alignments. Our tool, called COMPOSALIGN, was developed to sonify large-scale genomic data. The resulting musical composition is based on COMMON MUSIC and allows the mapping of genes to motifs and species to instruments. It enables the researcher to listen to the musical representation of the genome-wide alignment and contrasts with a bioinformatician's sight-oriented work at the computer.
- Conference paper: Semi-supervised learning for improving prediction of HIV drug resistance (German conference on bioinformatics 2009, 2009) Perner, Juliane; Altmann, André; Lengauer, Thomas. Resistance testing is an important tool in today's anti-HIV therapy management for improving the success of antiretroviral therapy. Routinely, the genetic sequence of viral target proteins is obtained. These sequences are then inspected for mutations that might confer resistance to antiretroviral drugs. However, interpretation of the genomic data is challenging. In recent years, approaches that employ supervised statistical learning methods were made available to assist the interpretation of the complex genetic information (e.g. geno2pheno and VircoTYPE). However, these methods rely on large amounts of labeled training data, which are expensive and labor-intensive to obtain. This work evaluates the application of semi-supervised learning (SSL) for improving the prediction of resistance from the viral genome.
- Conference paper: CUDA-based multi-core implementation of MDS-based bioinformatics algorithms (German conference on bioinformatics 2009, 2009) Fester, Thilo; Schreiber, Falk; Strickert, Marc. Solving problems in bioinformatics often requires extensive computational power. Current trends in processor architecture, especially massively multi-core processors for graphics cards, combine a large number of cores on a single chip to improve the overall performance. The Compute Unified Device Architecture (CUDA) provides programming interfaces to make full use of the computing power of graphics processing units. We present a way to use CUDA for substantial performance improvement of methods based on multi-dimensional scaling (MDS). The suitability of the CUDA architecture as a high-performance computing platform is studied by adapting an MDS algorithm to specific hardware properties. We show how typical bioinformatics problems related to dimension reduction and network layout benefit from the multi-core implementation of the MDS algorithm. CUDA-based methods are introduced and compared to standard solutions, demonstrating 50-fold acceleration and above.
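
For readers unfamiliar with multi-dimensional scaling, which the last entry above builds on, the following is a minimal sketch of metric MDS by plain gradient descent on the raw stress. It is not the authors' CUDA implementation, and the abstract does not say which MDS variant they use. The function names (`pairwise_distances`, `raw_stress`, `mds_gradient_descent`) and the parameters `steps`, `lr`, and `seed` are hypothetical and chosen only for illustration.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate sequences."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def raw_stress(target, points):
    """Sum over pairs i < j of (embedded distance - target distance)^2."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(points[i], points[j]) - target[i][j]) ** 2
    return total


def mds_gradient_descent(target, dim=2, steps=500, lr=0.01, seed=0):
    """Embed n objects in `dim` dimensions so that their pairwise distances
    approximate `target` (illustrative sketch; parameters are assumptions)."""
    rng = random.Random(seed)
    n = len(target)
    pts = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        grads = [[0.0] * dim for _ in range(n)]
        # The gradient for point i only reads the current coordinates, so the
        # outer loop is independent per point (the kind of work that a
        # many-core device could, in principle, process in parallel).
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(pts[i], pts[j])
                if d == 0.0:
                    continue  # gradient is undefined for coincident points
                coeff = 2.0 * (d - target[i][j]) / d
                for k in range(dim):
                    grads[i][k] += coeff * (pts[i][k] - pts[j][k])
        # Simultaneous update of all points after the gradients are computed.
        for i in range(n):
            for k in range(dim):
                pts[i][k] -= lr * grads[i][k]
    return pts


if __name__ == "__main__":
    # Toy input: distances between the corners of a unit square.
    target = pairwise_distances([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
    layout = mds_gradient_descent(target)
    print("final stress:", raw_stress(target, layout))
```

The cost per iteration grows quadratically with the number of objects, which is one plausible reason why MDS-based methods are candidates for GPU acceleration.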