Auflistung nach Autor:in "Beckstette, Michael"
1 - 3 von 3
Treffer pro Seite
Sortieroptionen
- KonferenzbeitragGenlight: An interactive system for high-throughput sequence analysis and comparative genomics(German Conference on Bioinformatics 2004, GCB 2004, 2004) Beckstette, Michael; Sczyrba, Alexander; Selzer, Paul M.
- KonferenzbeitragLightweight Comparison of RNAs Based on Exact Sequence-Structure Matches(German Conference on Bioinformatics, 2008) Heyne, Steffen; Will, Sebastian; Beckstette, Michael; Backofen, RolfSpecific functions of RNA molecules are often associated with different motifs in the RNA structure. The key feature that forms such an RNA motif is the combination of sequence and structure properties. In this paper we introduce a new RNA sequence-structure comparison method which maintains exact matching substructures. Existing common substructures are treated as whole unit while variability is allowed between such structural mo- tifs. Based on a fast detectable set of overlapping and crossing substructure matches for two nested RNA secondary structures, our method computes the longest colinear sequence of substructures common to two RNAs in O(n2m2) time and O(nm) space. Applied to different RNAs, our method correctly identifies sequence-structure similarities between two RNAs. The results of our experiments are in good agreement with existing alignment-based meth- ods, but can be obtained in a fraction of running time, in particular for larger RNAs. The proposed algorithm is implemented in the program expaRNA, which is available from our website (www.bioinf.uni-freiburg.de/Software).
- KonferenzbeitragPoSSuMsearch: Fast and sensitive matching of position specific scoring matrices using enhanced suffix arrays(German Conference on Bioinformatics 2004, GCB 2004, 2004) Beckstette, Michael; Strothmann, Dirk; Homann, Romann; Giegerich, Robert; Kurtz, StefanIn biological sequence analysis, position specific scoring matrices (PSSMs) are widely used to represent sequence motifs. In this paper, we present a new nonheuristic algorithm, called ESAsearch, to efficiently find matches of such matrices in large databases. Our approach preprocesses the search space, e.g. a complete genome or a set of protein sequences, and builds an enhanced suffix array which is stored on file. The enhanced suffix array only requires 9 bytes per input symbol, and allows to search a database with a PSSM in sublinear expected time. We also address the problem of non-comparable PSSM-scores by developing a method which allows to efficiently compute a matrix similarity threshold for a PSSM, given an E-value or a p-value. Our method is based on dynamic programming. In contrast to other methods it employs lazy evaluation of the dynamic programming matrix: it only evaluates those matrix entries that are necessary to derive the sought similarity threshold. We tested algorithm ESAsearch with nucleotide PSSMs and with amino acid PSSMs. Compared to the best previous methods, ESAsearch show speedups of a factor between 4 and 50 for nucleotide PSSMs, and speedups up to a factor 1.8 for amino acid PSSMs. Comparisons with the most widely used programs even show speedups by a factor of at least 10. The lazy evaluation method is also much faster than previous methods, with speedups by a factor of at least 10.